Preface



This volume comprises the papers presented at the Third International Andrei
Ershov Memorial Conference “Perspectives of System Informatics”, Akademgorodok
(Novosibirsk, Russia), July 6-9, 1999. The main goal of the conference
was to give an overview of research directions which are decisive for the growth 
of major areas of research activities in system informatics. 

The conference was the third in the series. The first and second international
conferences “Perspectives of System Informatics” were held in Novosibirsk,
Akademgorodok, in May, 1991, and June, 1996, respectively. Both conferences 
gathered a wide spectrum of specialists and were undoubtedly very successful. 

The third conference included many of the subjects of the second conference, 
such as theoretical computer science, programming methodology, new informa- 
tion technologies, and the promising field of artificial intelligence — as important 
components of system informatics. The style of the second conference was pre- 
served to a certain extent in that there were a considerable number of invited 
papers in addition to the contributed papers. However, posters were replaced by 
short talks mainly given by young researchers. 

This time 73 papers were submitted to the conference by researchers from 
all continents. Each paper was reviewed by three experts, at least two of them 
from the same or a closely related discipline as the authors. The reviewers gen- 
erally provided high quality assessments of the papers and often gave extensive 
comments to the authors for the possible improvement of the presentations. As 
a result, the program committee selected 27 high quality papers as regular talks 
and 17 papers as short talks. A broad range of “hot” topics in system informatics 
were covered by eight invited talks given by prominent computer scientists from 
different countries. 

The conference, like the previous ones, was dedicated to the memory of 
A. P. Ershov, the real and recognized leader in Soviet (and Russian) informatics. 

The late Academician Andrei P. Ershov was a man for all seasons. He com- 
manded universal respect and received affection all over the world. His view of 
programming was both a human one and a scientific one. At Akademgorodok 
he created a unique group of scientists, some now in far away regions of the
world: a good example of “technology transfer”, although perhaps not one that
too many people in Russia are happy about.

Many of his disciples and colleagues continue to work in the directions initi- 
ated or stimulated by him, at the A. P. Ershov Institute of Informatics Systems. 
The institute was the main organizer of the three conferences. 

We are glad to express our gratitude to all the persons and organizations who
contributed to the conference: to the sponsors for their moral, financial, and
organizational support, and to the members of the local organizing committee for
their joint efforts towards the success of this event. We are especially grateful
to N. Cheremnykh for her selfless labour when preparing the conference.



October, 1999 D. Bjørner,

M. Broy, 
A. Zamulin 




Organization 



Conference Chair: Alexander Marchuk (Novosibirsk, Russia) 

Program Committee Co-chairs: Dines Bjørner (Lyngby, Denmark)

Manfred Broy (Munich, Germany) 
Alexandre Zamulin (Novosibirsk, Russia) 

Program Committee: 

Janis Barzdins (Latvia) Gennady Osipov (Russia) 

Frederic Benhamou (France) Jaan Penjam (Estonia) 

Christian Boitet (France) Peter Pepper (Germany) 

Mikhail Bulyonkov (Russia) Igor Pottosin (Russia) 

Piotr Dembinski (Poland) Wolfgang Reisig (Germany) 

Alexander Dikovsky (France) Dieter Rombach (Germany) 

Victor Ivannikov (Russia) Dean Rosenzweig (Croatia) 

Philippe Jorrand (France) Viktor Sabelfeld (Germany) 

Leonid Kalinichenko (Russia) Vladimir Sazonov (Russia) 

Alexander Kleschev (Russia) David Schmidt (USA) 

Vadim Kotov (USA) Sibylle Schupp (USA) 

Reino Kurki-Suonio (Finland) Valery Sokolov (Russia) 

Alexander Letichevski (Ukraine) Nicolas Spyratos (France) 

Eduard Ljubimsky (Russia) Alexander Tomilin (Russia) 

Rüdiger Loos (Germany) Enn Tyugu (Sweden)

Bernhard Möller (Germany) Andrei Voronkov (Sweden)

Hanspeter Mössenböck (Austria) Tatyana Yakhno (Russia)

Valery Nepomniaschy (Russia) Zhou Chaochen (Macau) 



Additional Referees 



P. A. Abdulla        M. Korovina     U. Sarkans
I. Anureev           G. Kucherov     K. Schneider
C. Bunse             S. Krivoi       W. Schwerin
K. Cerans            K. Lellahi      N. Shilov
Dang Van Hung        F. Moller       T. Stauner
T. Ehm               O. Müller       M. Tudruj
S. Gaissaryan        A. Mycroft      M. Valiev
A. Godlevskiy        J. Philipps     D. von Oheimb
M. Gorbunov-Posadov  K. Podnieks     J. Winkovski
T. Jen               A. Sabelfeld    Xu Qiwen







Conference Secretary 

Natalia Cheremnykh (Novosibirsk, Russia) 



Local Organizing Committee 



Sergei Kuznetsov 
Gennady Alexeev 
Alexander Bystrov 
Tatyana Churina 



Vladimir Detushev 
Olga Drobyshevich 
Vera Ivanova 
Vladimir Sergeev 



Anna Shelukhina 
Irina Zanina 



Sponsors 

Support from the following institutions is gratefully acknowledged: 

• Russian Foundation for Basic Research 

• Office of Naval Research, USA 

• Nortel Networks, Canada 

• Relativity Technologies, Inc, USA 

• UN University’s International Institute for Software Technology, Macau 




Table of Contents 



Algebraic Specifications

The Common Framework Initiative for Algebraic Specification and
Development of Software (Invited Talk)
    D. Sannella

A Logical Approach to Specification of Hybrid Systems
    M. V. Korovina, O. V. Kudinov

Specifications with States

Algebraic Imperative Specifications (Invited Talk)
    M.-C. Gaudel, A. Zamulin

Enhanced Control Flow Graphs in Montages
    M. Anlauff, Ph. W. Kutter, A. Pierantonio

Abstract State Machines for the Composition of Architectural Styles
    A. Sünbül

Partial Evaluation and Supercompilation

The Essence of Program Transformation by Partial Evaluation and Driving
(Invited Talk)
    N. D. Jones

Binding-Time Analysis in Partial Evaluation: One Size Does Not Fit All
    N. H. Christensen, R. Glück, S. Laursen

Abstraction-Based Partial Deduction for Solving Inverse Problems -
A Transformational Approach to Software Verification
    R. Glück, M. Leuschel

Sonic Partial Deduction
    J. Martin, M. Leuschel

On Perfect Supercompilation
    J. P. Secher, M. H. Sørensen

Linear Time Self-Interpretation of the Pure Lambda Calculus
    T. Æ. Mogensen

An Optimal Algorithm for Purging Regular Schemes
    D. L. Uvarov







Polymorphism in OBJ-P
    M. Plümicke

Concurrency and Parallelism

Formal Modelling of Services for Getting a Better Understanding of the
Feature Interaction Problem (Invited Talk)
    P. Gibson, D. Méry

Serializability Preserving Extensions of Concurrency Control Protocols
    D. Chkliaev, J. Hooman, P. van der Stok

Platform Independent Approach for Detecting Shared Memory Parallelism
    Yu. V. Chelomin

Hierarchical Cause-Effect Structures
    A. P. Ustimenko

Some Decidability Results for Nested Petri Nets
    I. A. Lomazova, Ph. Schnoebelen

Abstract Structures for Communication between Processes
    G. Ciobanu, E. F. Olariu

Logic and Processes

Applying Temporal Logic to Analysis of Behavior of Cooperating Logic
Programs
    M. I. Dekhtyar, A. Ja. Dikovsky, M. K. Valiev

On Semantics and Correctness of Reactive Rule-Based Programs
    M. Lin, J. Malec, S. Nadjm-Tehrani

Compositional Verification of CCS Processes
    M. Dam, D. Gurov

Compositional Style of Programming FPGAs
    E. Trichina

Languages and Software

Using Experiments to Build a Body of Knowledge (Invited Talk)
    V. Basili, F. Shull, F. Lanubile

Patterns in Words versus Patterns in Trees: A Brief Survey and New
Results
    G. Kucherov, M. Rusinowitch







Extensions: A Technique for Structuring Functional-Logic Programs
    R. Caballero, F. J. López-Fraguas

Language Tools and Programming Systems in Educational Informatics
    S. S. Kobilov

Database Programming

Current Directions in Hyper-Programming (Invited Talk)
    R. Morrison, R. C. H. Connor, Q. I. Cutts, A. Dearle, A. Farkas,
    G. N. C. Kirby, R. McGettrick, E. Zirintsis

Integration of Different Commit/Isolation Protocols in CSCW Systems
with Shared Data
    L. Frank

A General Object-Oriented Model for Spatial Data
    S. Asgari, N. Yonezaki

Object-Oriented Programming

Twin - A Design Pattern for Modeling Multiple Inheritance
    H. Mössenböck

A Partial Semantics for Object Data Models with Static Binding
    K. Lellahi, R. Souah

Heterogeneous, Nested STL Containers in C++
    V. Simonis, R. Weiss

Data Flow Analysis of Java Programs in the Presence of Exceptions
    V. I. Shelekhov, S. V. Kuksenko

Late Adaptation of Method Invocation Semantics
    M. Hof

Constraint Programming

A Control Language for Designing Constraint Solvers
    C. Castro, E. Monfroy

An Algorithm to Compute Inner Approximations of Relations for Interval
Constraints
    F. Benhamou, F. Goualard, E. Languenou, M. Christie

Constraint Programming Techniques for Solving Problems on Graphs
    V. Sidorov, V. Telerman, D. Ushakov

Extensional Set Library for ECLiPSe
    T. Yakhno, E. Petrov







Model and Program Checking

Introducing Mutual Exclusion in Esterel
    K. Schneider, V. Sabelfeld

Experiences with the Application of Symbolic Model Checking to the
Analysis of Software Specifications
    R. J. Anderson, P. Beame, W. Chan, D. Notkin

Formal Verification of a Compiler Back-End Generic Checker Program
    A. Dold, V. Vialard

Construction of Verified Compiler Front-Ends with Program-Checking
    A. Heberle, Th. Gaul, W. Goerigk, G. Goos, W. Zimmermann

Translating SA/RT Models to Synchronous Reactive Systems:
An Approximation to Modular Verification Using the SMV Model Checker
    C. de la Riva, J. Tuya, J. R. de Diego

Artificial Intelligence

Multi-agent Optimal Path Planning for Mobile Robots in Environment
with Obstacles
    F. A. Kolushev, A. A. Bogdanov

Approach to Understanding Weather Forecast Telegrams with Agent-Based
Technique
    I. S. Kononenko, I. G. Popov, Yu. A. Zagorulko

Approach to Development of a System for Speech Interaction with an
Intelligent Robot
    G. B. Cheblakov, F. G. Dinenberg, D. Ya. Levin, I. G. Popov,
    Yu. A. Zagorulko

Analysis of Sign Languages: A Step Towards Multi-lingual Machine
Translation for Sign Languages
    S. Herath, Ch. Saito, A. Herath

Author Index




The Common Framework Initiative for Algebraic 
Specification and Development of Software* 



Donald Sannella 

Laboratory for Foundations of Computer Science 
University of Edinburgh, UK 
dts@dcs.ed.ac.uk, www.dcs.ed.ac.uk/~dts/



Abstract. The Common Framework Initiative (CoFI) is an open in- 
ternational collaboration which aims to provide a common framework 
for algebraic specification and development of software. The central ele- 
ment of the Common Framework is a specification language called Casl 
for formal specification of functional requirements and modular software 
design which subsumes many previous algebraic specification languages. 
This paper is a brief summary of past and present work on CoFI. 



1 Introduction 

Algebraic specification is one of the most extensively-developed approaches in the 
formal methods area. The most fundamental assumption underlying algebraic 
specification is that programs are modelled as many-sorted algebras consisting 
of a collection of sets of data values together with functions over those sets. 
This level of abstraction is commensurate with the view that the correctness 
of the input/output behaviour of a program takes precedence over all its other
properties. Another common element is that specifications of programs consist 
mainly of logical axioms, usually in a logical system in which equality has a 
prominent role, describing the properties that the functions are required to satisfy.
This property-oriented approach is in contrast to so-called model-oriented
specifications in frameworks like VDM which consist of a simple realization of 
the required behaviour. Confusingly — because the theoretical basis of algebraic 
specification is largely in terms of constructions on algebraic models — it is at 
the same time much more model-oriented than approaches such as those based 
on type theory (see e.g. [NPS90]), where the emphasis is almost entirely on syn- 
tax and formal systems of rules while semantic models are absent or regarded 
as of secondary importance. 

The past 25 years have seen a great deal of research on the theory and
practice of algebraic specification. Overviews of this material include [Wir90],
[BKLOS91], [LEW96], [ST97], [AKK99] and [ST??]. Developments on the
foundational side have been balanced by work on applications, but despite a number
of success stories, industrial adoption has so far been limited. The proliferation of
algebraic specification languages is seen as a significant obstacle to the
dissemination and use of these techniques. Despite extensive past collaboration between
the main research groups involved and a high degree of agreement concerning the
basic concepts, the field has given the appearance of being extremely fragmented,
with no de facto standard specification language, let alone an international
standard. Moreover, although many tools supporting the use of algebraic techniques
have been developed in the academic community, none of them has gained wide
acceptance, at least partly because of their isolated usability: each tool uses a
different specification language.

* This research was supported by the ESPRIT-funded CoFI Working Group.

Since late 1995, work has been underway in an attempt to remedy this situ- 
ation. The Common Framework Initiative (abbreviated CoFI) is an open inter- 
national collaboration which aims to provide a common framework for algebraic 
specification and development of software. The Common Framework is intended 
to be attractive to researchers in the field as a common basis for their work, and 
to ultimately become attractive for use in industry. The central element of the 
Common Framework is a specification language called Casl (the Common Al- 
gebraic Specification Language), intended for formal specification of functional 
requirements and modular software design and subsuming many previous spec- 
ification languages. Development of prototyping and verification tools for Casl 
will lead to them being interoperable, i.e. capable of being used in combination 
rather than in isolation. 

Most effort to date has concentrated on the design of Casl, which concluded 
in late 1998. Even though the intention was to base the design on a critical se- 
lection of concepts and constructs from existing specification languages, it was 
not easy to reach a consensus on a coherent language design. A great deal of 
careful consideration was given to the effect that the constructs available in the 
language would have on such aspects as the methodology for formal development 
of modular software from specifications and the ease of constructing appropriate 
support tools. A complete formal semantics for Casl was produced in paral- 
lel with the later stages of the language design, and the desire for a relatively 
straightforward semantics was one factor in the choice between various
alternatives in the design. Work on CoFI has been an activity of IFIP WG 1.3 and the
design of Casl has been approved by this group. 

This paper is a brief summary of work in CoFI with pointers to information 
available elsewhere. Casl is given special prominence since it is the main concrete 
product of CoFI so far. A more extensive description of the rationale behind 
CoFI and Casl may be found in [Mos97] and [Mos99]. 

2 CASL 

Casl represents a consolidation of past work on the design of algebraic specifica- 
tion languages. With a few minor exceptions, all its features are present in some 
form in other languages but there is no language that comes close to subsuming 
it. Designing a language with this particular novel collection of features required 
solutions to a number of subtle problems in the interaction between features. 

It soon became clear that no single language could suit all purposes. On
one hand, sophisticated features are required to deal with specific programming 
paradigms and special applications. On the other, important methods for pro- 
totyping and reasoning about specifications only work in the absence of certain 
features: for instance, term rewriting requires specifications with equational or 
conditional equational axioms. 

Casl is therefore the heart of a family of languages. Some tools will make use 
of well-delineated sub-languages of Casl obtained by syntactic or semantic re- 
strictions, while extensions of Casl will be defined to support various paradigms 
and applications. The design of Casl took account of some of the planned
extensions, particularly one that involves higher-order functions [MHK98], and this
had an important impact on decisions concerning matters like concrete syntax. 

Casl consists of the following major parts or “layers”: basic specifications; 
structured specifications; architectural specifications; specification libraries. A 
detailed description of the features of Casl may be found in [Mos99] and the 
complete language definition is in [CoFI98]. Here we just give a quick overview 
and a couple of simple examples in the hope that this will give a feeling for what 
Casl is like. Further examples may be found in the appendices of [CoFI98]. 
Since features of various existing specification languages have found their way 
into Casl in some form, there are of course many interesting relationships with 
other languages. It is not the purpose of this paper to detail these, so many
relevant references are omitted.

A Casl basic specification denotes a class of many-sorted partial first-order
structures: algebras where the functions are partial or total, and where also 
predicates are allowed. These are classified by signatures, which list sort names, 
partial and total function names, and predicate names, together with profiles of 
functions and predicates. The sorts are partially ordered by a subsort inclusion 
relation, which is interpreted as embedding rather than set-theoretic inclusion, 
and is required to commute with overloaded functions. A Casl basic specifica- 
tion includes declarations to introduce components of signatures and axioms to 
give properties of structures that are to be considered as models of a specifica- 
tion. Axioms are written in first-order logic (so, with quantifiers and the usual 
logical connectives) built over atomic formulae which include strong and exis- 
tential equalities, definedness formulae and predicate applications, with gener- 
ation constraints added as special, non-first-order sentences. The interpretation 
of formulae is as in classical two-valued first-order logic, in contrast to some
frameworks that accommodate partial functions. Concise syntax is provided for 
specifications of “datatypes” with constructor and selector functions. 

Here is an example of a basic specification: 

    free types Nat ::= 0 | sort Pos;
               Pos ::= suc(pre : Nat)
    op pre : Nat →? Nat
    axioms
        ¬def pre(0);
        ∀n : Nat • pre(suc(n)) = n
    pred even__ : Nat
    var n : Nat
    • even 0
    • even suc(n) ⇔ ¬even n
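
As a rough executable reading of this specification, the following Python sketch models Nat as the non-negative integers. The use of None to stand for "undefined", and the helper names is_defined, strong_eq and exist_eq, are assumptions made purely for illustration; they are not part of Casl.

```python
from typing import Optional


def suc(n: int) -> int:
    """Total constructor suc, producing a positive number."""
    return n + 1


def pre(n: int) -> Optional[int]:
    """Partial selector pre : Nat ->? Nat; None stands for 'undefined'."""
    return n - 1 if n > 0 else None


def is_defined(t: Optional[int]) -> bool:
    """Definedness formula 'def t'."""
    return t is not None


def strong_eq(a: Optional[int], b: Optional[int]) -> bool:
    """Strong equality: both sides undefined, or both defined and equal."""
    return a == b


def exist_eq(a: Optional[int], b: Optional[int]) -> bool:
    """Existential equality: both sides defined and equal."""
    return a is not None and b is not None and a == b


def even(n: int) -> bool:
    """even 0 holds; even suc(n) holds exactly when even n does not."""
    return True if n == 0 else not even(n - 1)
```

Under these assumptions the two axioms for pre become checkable statements: pre(0) is undefined, and pre(suc(n)) equals n for every n tried. The pair strong_eq and exist_eq differ only on undefined terms, which is the distinction the text draws between strong and existential equalities.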

The remaining features of Casl do not depend on the details of the features 
for basic specifications, so this part of the design is orthogonal to the rest. An 
important consequence of this is that sub-languages and extensions of Casl 
can be defined by restricting or extending the language of basic specifications 
(under certain conditions) without the need to reconsider or change the rest of 
the language. 

Casl provides ways of building complex specifications out of simpler ones 
(the simplest ones being basic specifications) by means of various specification- 
building operations. These include translation, hiding, union, and both free and 
loose forms of extension. A structured specification denotes a class of many- 
sorted partial first-order structures, as with basic specifications. Thus the struc- 
ture of a specification is not reflected in its models: it is used only to present the 
specification in a modular style. Structured specifications may be named and a 
named specification may be generic, meaning that it declares some parameters 
that need to be instantiated when it is used. Instantiation is a matter of pro- 
viding an appropriate argument specification together with a fitting morphism 
from the parameter to the argument specification. Fitting may also be accom- 
plished by the use of named views between specifications. Generic specifications 
correspond to what is known in other specification languages as (pushout-style)
parametrized specifications. 

Here is an example of a generic specification (referencing a specification
named Partial_Order, which is assumed to declare the sort Elem and the
predicate __ ≤ __):

    spec List_with_Order [Partial_Order] =
        free type List[Elem] ::= nil | cons(hd :? Elem; tl :? List[Elem])
    then
        local
            op insert : Elem × List[Elem] → List[Elem];
            vars x, y : Elem; l : List[Elem]
            axioms insert(x, nil) = cons(x, nil);
                   x ≤ y ⇒ insert(x, cons(y, l)) = cons(x, insert(y, l));
                   ¬(x ≤ y) ⇒ insert(x, cons(y, l)) = cons(y, insert(x, l))
        within
            op order[__ ≤ __] : List[Elem] → List[Elem]
            vars x : Elem; l : List[Elem]
            axioms order[__ ≤ __](nil) = nil;
                   order[__ ≤ __](cons(x, l)) = insert(x, order[__ ≤ __](l))
    end
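
The axioms of this generic specification can be read directly as a recursive program. The Python sketch below is one such reading under stated assumptions: lists are modelled as tuples, the parameter is supplied as a function leq, and the closure list_with_order (a name invented here) plays the role of instantiation, with insert hidden inside it in the spirit of local ... within.

```python
from typing import Callable, Tuple, TypeVar

Elem = TypeVar("Elem")


def list_with_order(
    leq: Callable[[Elem, Elem], bool]
) -> Callable[[Tuple[Elem, ...]], Tuple[Elem, ...]]:
    """Instantiate the generic specification with an ordering and return 'order'."""

    def insert(x: Elem, l: Tuple[Elem, ...]) -> Tuple[Elem, ...]:
        # insert(x, nil) = cons(x, nil)
        if not l:
            return (x,)
        y, rest = l[0], l[1:]
        # x <= y  =>  insert(x, cons(y, l)) = cons(x, insert(y, l))
        if leq(x, y):
            return (x,) + insert(y, rest)
        # not (x <= y)  =>  insert(x, cons(y, l)) = cons(y, insert(x, l))
        return (y,) + insert(x, rest)

    def order(l: Tuple[Elem, ...]) -> Tuple[Elem, ...]:
        # order(nil) = nil; order(cons(x, l)) = insert(x, order(l))
        if not l:
            return ()
        return insert(l[0], order(l[1:]))

    return order  # 'insert' stays local, as in the specification
```

For example, list_with_order(lambda a, b: a <= b) applied to (3, 1, 2) yields (1, 2, 3). Only order is visible to the caller, which mirrors the fact that insert does not belong to the signature of the resulting specification.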

Architectural specifications in Casl are for describing the modular structure
of software, in contrast to structured specifications where the structure
is only for presentation purposes. Architectural specifications are probably the
most novel aspect of Casl; they are not entirely new, but they have no coun- 
terpart in most algebraic specification languages. An architectural specification 
consists of a list of unit declarations, indicating the component modules required 
with specifications for each of them, together with a unit term that describes 
the way in which these modules are to be combined. (There is an unfortunate 
potential for confusion here: in Casl, the term “architecture” refers to the “im- 
plementation” modular structure of the system rather than to the “interaction” 
relationships between modules in the sense of [AG97].) Units are normally func- 
tions which map structures to structures, where the specification of the unit 
specifies properties that the argument structure is required to satisfy as well 
as properties that are guaranteed of the result. These functions are required to 
be persistent, meaning that the argument structure is preserved intact in the 
result structure. This corresponds to the fact that a software module must use 
its imports as supplied without altering them. 

Here is a simple example of an architectural specification (referencing ordi- 
nary specifications named List, Char, and Nat, assumed to declare the sorts 
Elem and List[Elem], Char, and Nat, respectively): 

    arch spec CN_List =
        units
            C : Char;
            N : Nat;
            F : Elem → List[Elem]
        result F[C fit Elem ↦ Char] and F[N fit Elem ↦ Nat]

More about architectural specifications, including further examples, may be 
found in [BST99]. 
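
The idea of units as persistent functions on structures can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: structures are modelled as dictionaries from symbol names to interpretations, a fitting is a plain name map, and the helper unit_and stands in for the combination written "and" in the unit term. None of this is Casl syntax or its formal semantics.

```python
def F(arg: dict, fit: dict) -> dict:
    """Generic unit: extend a structure interpreting Elem with lists over it."""
    elem = fit["Elem"]            # name of the sort that Elem is fitted to
    carrier = arg[elem]           # its interpretation, here a Python type
    list_sort = f"List[{elem}]"

    def cons(x, l):
        if not isinstance(x, carrier):
            raise TypeError(f"{x!r} is not of sort {elem}")
        return (x,) + l

    result = dict(arg)            # persistence: the argument is kept intact
    result[list_sort] = tuple
    result[f"nil:{list_sort}"] = ()
    result[f"cons:{list_sort}"] = cons
    return result


def unit_and(u1: dict, u2: dict) -> dict:
    """Combine two structures; symbols they share must be interpreted identically."""
    for name in u1.keys() & u2.keys():
        if u1[name] is not u2[name]:
            raise ValueError(f"incompatible interpretations of {name}")
    return {**u1, **u2}


C = {"Char": str}                 # a unit satisfying Char
N = {"Nat": int}                  # a unit satisfying Nat
CN_List = unit_and(F(C, {"Elem": "Char"}), F(N, {"Elem": "Nat"}))
```

The copy made by dict(arg) is what makes F persistent in this model: every symbol of the argument appears unchanged in the result, so F can add list operations but cannot alter the imported Char or Nat.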

Libraries in Casl are collections of named specifications. A specification can 
refer to an item in a library by giving its name and the location of the library that 
contains it. Casl includes direct support for establishing distributed libraries on 
the Internet with version control. 

3 Semantics 

The formal semantics of Casl, which is complete but whose presentation still 
requires some work, is in [CoFI99]. The semantics is divided into the same parts
as the language definition (basic specifications, structured specifications, etc.) 
but in each part there is also a split into static semantics and model semantics. 

The static semantics checks well-formedness of phrases and produces a “syn- 
tactic” object as result, failing to produce any result for ill-formed phrases. For 
example, for a basic specification the static semantics yields a theory presenta- 
tion containing the sorts, function symbols, predicate symbols and axioms that 
belong to the specification. (Actually it yields an enrichment: when a basic
specification is used to extend an existing specification it may refer to existing sorts,
functions and predicates.) A phrase may be ill-formed because it makes reference
to non-existent identifiers or because it contains a sub-phrase that fails to type
check. The model semantics provides the corresponding model-theoretic part of
the semantics, and is intended to be applied only to phrases that are well-formed 
according to the static semantics. For a basic specification, the model semantics 
yields a class of models. A statically well-formed phrase may still be ill-formed 
according to the model semantics: for example, if a generic specification is in- 
stantiated with an argument specification that has an appropriate signature but 
which has models that fail to satisfy the axioms in the parameter specification, 
then the result is undefined. The judgements of the static and model semantics 
are defined inductively by means of rules in the style of Natural Semantics. 
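
The two-stage check described here can be sketched for the instantiation example. The Python code below is a simplified illustration, not the Casl semantics: it assumes a parameter with symbols Elem and leq and partial-order axioms, finite models given as dictionaries, and a result of None to represent "undefined". The names StaticError, static_semantics, model_semantics and instantiate are all hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, List, Optional


class StaticError(Exception):
    """Raised when a phrase is ill-formed according to the static semantics."""


PARAM_SYMBOLS = ("Elem", "leq")   # assumed signature of the parameter


def static_semantics(fitting: Dict[str, str], arg_symbols: Iterable[str]) -> Dict[str, str]:
    """Signature-level check: each parameter symbol is fitted to an argument symbol."""
    available = set(arg_symbols)
    for sym in PARAM_SYMBOLS:
        if fitting.get(sym) not in available:
            raise StaticError(f"parameter symbol {sym} is not fitted")
    return fitting


def satisfies_partial_order(carrier, leq) -> bool:
    """Check reflexivity, antisymmetry and transitivity on a finite carrier."""
    xs = list(carrier)
    reflexive = all(leq(x, x) for x in xs)
    antisymmetric = all(
        x == y or not (leq(x, y) and leq(y, x)) for x, y in product(xs, repeat=2)
    )
    transitive = all(
        leq(x, z) or not (leq(x, y) and leq(y, z)) for x, y, z in product(xs, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def model_semantics(fitting: Dict[str, str], arg_models: List[dict]) -> Optional[List[dict]]:
    """Model-level check: undefined (None) if some argument model violates the axioms."""
    for m in arg_models:
        if not satisfies_partial_order(m[fitting["Elem"]], m[fitting["leq"]]):
            return None
    return arg_models


def instantiate(fitting, arg_symbols, arg_models):
    """Apply the model semantics only to statically well-formed instantiations."""
    return model_semantics(static_semantics(fitting, arg_symbols), arg_models)
```

An argument that interprets leq as strict less-than passes the static check, since its signature fits, but makes the result undefined because reflexivity fails. An argument with no symbol for leq is rejected earlier, by the static semantics.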

The orthogonality of basic specifications in Casl with respect to the rest of 
the language is reflected in the semantics by the use of a variant of the notion 
of institution [GB92] called an institution with symbols [Mos98]. (For readers 
who are unfamiliar with the notion of institution, it corresponds roughly to 
“logical system appropriate for writing specifications”.) The semantics of basic 
specifications is regarded as defining a particular institution with symbols, and 
the rest of the semantics is based on an arbitrary institution with symbols. 

The semantics provides a basis for the development of a proof system for 
Casl. As usual, at least three levels are needed: proving consequences of sets of 
axioms; proving consequences of structured specifications; and finally, proving 
the refinement relation between structured specifications. The semantics of Casl 
gives a reference point for checking the soundness of each of the proposed proof 
systems and for studying their completeness. 

4 Methodology 

The original motivation for work on algebraic specification was to enable the 
stepwise development of correct software systems from specifications with veri- 
fied refinement steps. Casl provides good support for the production of specifi- 
cations both of the problem to be solved and of components of the solution, but it 
does not incorporate a specific notion of refinement. Architectural specifications 
go some way towards relating different stages of development but they do not 
provide the full answer. Other methodological issues concern the “endpoints” of 
the software development process: how the original specification is obtained in 
the first place (requirements engineering), and how the transition is made from 
Casl to a given programming language. Finally, the usual issues in programming 
methodology are relevant here, for instance: verification versus testing; software 
reuse and specification reuse; software reverse engineering; software evolution. 

Casl has been designed to accommodate multiple methodologies. Various 
existing methodologies and styles of use of algebraic specifications have been 
considered during the design of Casl to avoid unnecessary difficulties for users 
who are accustomed to a certain way of doing things. For the sake of concreteness, 
the present author prefers the methodology espoused in [ST97], and work on 
adapting this methodology to Casl has begun. 

5 Support Tools

Tool activity initially focussed on the concrete syntax of Casl to provide feed- 
back to the language design since the exact details of the concrete syntax can 
have major repercussions for parsing. Casl offers a flexible syntax with mixfix 
notation for application of functions and predicates to arguments, which re- 
quires relatively advanced parsing methods. ASF+SDF was used to prototype 
the Casl syntax in the course of its design, and several other parsers have been 
developed concurrently. Also available is a LaTeX package for uniform formatting
of Casl specifications with easy conversion to HTML format. ATerms [BKO98]
have been chosen as the common interchange format for CoFI tools. This
provides a tree representation for various objects (programs, specifications, abstract
syntax trees, proofs) and annotations to store computed results so that one tool 
can conveniently pass information to another. Work is underway on a format for 
annotations and on a list of specific kinds of annotations. 
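
To make the role of annotations concrete, here is a minimal Python sketch of an annotated term tree passed between two tools. It is not the ATerm library or its API: the Term class, the "size" annotation and both functions are hypothetical, chosen only to show one tool storing a computed result that another can read without recomputing it.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Term:
    """A tree node: function symbol, argument terms, and a bag of annotations."""
    fun: str
    args: Tuple["Term", ...] = ()
    annos: Dict[str, object] = field(default_factory=dict)


def annotate_sizes(t: Term) -> int:
    """First tool: compute the size of every subterm and store it as an annotation."""
    size = 1 + sum(annotate_sizes(a) for a in t.args)
    t.annos["size"] = size
    return size


def render(t: Term) -> str:
    """Second tool: print the term; annotations travel with it but are not shown."""
    if not t.args:
        return t.fun
    return f"{t.fun}({', '.join(render(a) for a in t.args)})"


two = Term("suc", (Term("suc", (Term("0"),)),))
annotate_sizes(two)
```

After annotate_sizes has run, any later tool can read two.annos["size"] directly. Agreeing on which annotation keys exist and what they mean is the kind of question that a shared list of annotation kinds would settle.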

At present, the principal focus of tools work in CoFI is on adapting tools 
that already exist for use with Casl. Existing rewrite engines such as in OBJ, 
ASF+SDF and ELAN should provide a good basis for prototyping (parts of) 
Casl specifications. For verification tools, we plan to reuse existing proof tools 
for specific subsets of Casl: equational, conditional, full first-order logic with 
total functions, total functions with subsorts, partial functions, etc. The integra- 
tion of proof tools such as SPIKE, EXPANDER and others will provide the po- 
tential to perform proofs by induction, observational proofs, termination proofs, 
etc. One system on which development is already well-advanced is HOL-CASL 
[MKK98] which provides static analysis of Casl specifications and theorem prov- 
ing via an encoding into the Isabelle/HOL theorem prover [Pau94]. Another is 
INKA 5.0 [AHMS99] which provides theorem proving for a sub-language of Casl 
that excludes partial functions. 

6 Specification of Reactive Systems 

An area of particular interest for applications is that of reactive, concurrent, 
distributed and real-time systems. There is considerable past work in algebraic 
specification that tackles systems of this kind, but nonetheless the application of 
Casl to such systems is speculative and preliminary in comparison with the rest
of CoFI. The aim here is to propose and develop one or more extensions of Casl
to deal with systems of this kind, and to study methods for developing software 
from such specifications. Extensions in three main categories are currently being 
considered: 

— Combination of formalisms for concurrency (e.g. CCS, Petri nets, CSP) with 
Casl for handling classical (static) data structures; 

— Formalisms built over Casl, where processes are treated as special dynamic 
data; and 

— Approaches where CASL is used for coding at the meta-level some formalism 
for concurrency, as an aid to reasoning. 

Work in this area began only after the design of Casl was complete and so it
is still in its early stages.

7 Invitation 

CoFI is an open collaboration, and new participants are welcome to join at any 
time. Anybody who wishes to contribute is warmly invited to visit the CoFI web 
site at http://www.brics.dk/Projects/CoFI/ where all CoFI documentation, 
design notes, minutes of past meetings etc. are freely available. Announcements 
of general interest to CoFI participants are broadcast on the low-volume mailing 
list cofi-list@brics.dk and each task group has its own mailing list; see the 
CoFI web site for subscription instructions. All of these mailing lists are mod- 
erated. Funding from the European Commission is available until September 
2000 to cover travel to CoFI meetings although there are strict rules concerning 
eligibility, see http://www.dcs.ed.ac.uk/home/dts/CoFI-WG/. 

Acknowledgements. Many thanks to all the participants of CoFI, and in particular
to the coordinators of the various CoFI Task Groups: Bernd Krieg-Brückner (Language
Design); Andrzej Tarlecki (Semantics); Michel Bidoit (Methodology); Hélène Kirchner
(Tools); Egidio Astesiano (Reactive Systems); and especially Peter Mosses (External
Relations) who started CoFI and acted as overall coordinator until mid-1998.

Abstract. The main subject of our investigation is the behaviour of the
continuous components of hybrid systems. By a hybrid system we mean
a network of digital and analog devices interacting at discrete times. A
first-order logical formalization of hybrid systems is proposed in which
the trajectories of the continuous components are represented by
majorant-computable functionals.



1 Introduction 

Recently, attention to the problems of exact mathematical formalization
of complex systems such as hybrid systems has been growing steadily. By a hybrid
system we mean a network of digital and analog devices interacting at discrete
times. An important characteristic of hybrid systems is that they incorporate
both continuous components, usually called plants, and digital components,
i.e. digital computers, sensors and actuators controlled by programs. These
programs are designed to select, control, and supervise the behaviours of the
continuous components. Modelling, design, and investigation of behaviours of hybrid
systems have recently become active areas of research in computer science (for
example, see [7,10,11,15,16,19]). We use the models of hybrid systems proposed
by Nerode and Kohn in [19].

A hybrid system is a system which consists of a continuous plant that is
disturbed by the external world and controlled by a program implemented on a
sequential automaton. The control program reads sensor data (a sensor function
of the state of the plant, sampled at discrete times), computes the next control
law, and imposes it on the plant. The plant continues to use this control law until
the next such intervention.
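
The read, compute, impose cycle described above can be sketched as a sample-and-hold loop. The following Python sketch is only an illustration: the plant dynamics, the identity sensor, the thermostat-like controller and all names (`simulate`, `plant_step`, `controller`) are assumptions made for the example and are not taken from [19].

```python
from typing import Callable, List, Tuple

def simulate(
    times: List[float],
    x0: float,
    plant_step: Callable[[float, float, float], float],
    sensor: Callable[[float], float],
    controller: Callable[[float], float],
) -> List[Tuple[float, float, float]]:
    """Run the control loop over the sampling times; return (t_i, state, law)."""
    x = x0
    trace = []
    for t_now, t_next in zip(times, times[1:]):
        # The control program reads sensor data and computes the next control law.
        law = controller(sensor(x))
        trace.append((t_now, x, law))
        # The plant keeps using this law until the next intervention at t_next.
        x = plant_step(x, law, t_next - t_now)
    return trace

# Hypothetical plant: the state drifts at the rate given by the control law.
def plant_step(x: float, law: float, dt: float) -> float:
    return x + law * dt

# Hypothetical controller: push the state up below 20, down otherwise.
def controller(measurement: float) -> float:
    return 1.0 if measurement < 20.0 else -1.0

trace = simulate([1.0, 2.0, 3.0, 4.0, 5.0], 18.5, plant_step, lambda x: x, controller)
```

In this toy run the control law is recomputed only at the sampling times, which is the point of the model: between two interventions the plant evolves under a fixed law.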

A representation of the external world is the input data of the plant. The control
automaton has input data (the set of sensor measurements) and output data
(the set of control laws). The control automaton is modelled by three units. The
first unit is a converter which converts each measurement into input symbols
of the internal control automaton. The internal control automaton, in practice,
is a finite state automaton with finite input and output alphabets. The second
unit is the internal control automaton, which has a symbolic representation of a
measurement as input and produces a symbolic representation of the next control
law to be imposed on the plant as output. The third unit is a converter which
converts these output symbols representing control laws into the actual control
laws imposed on the plant. The plant interacts with the external world and the
control automaton at times $t_i$, where the time sequence $\{t_i\}$ satisfies
realizability requirements.

The main subject of our investigation is the behaviour of the continuous
components. In [19], the set of all possible trajectories of the plant was called
a performance specification. We propose a first-order logical formalization of
hybrid systems in which the trajectories of the continuous components (the
performance specification) are represented by majorant-computable functionals. The
following properties are the main characteristic properties of our approach.

1. Information about the external world is represented by a majorant-computable
real-valued function. In nontrivial cases, for proper behaviour our system
should analyse some complicated external information at every moment when
such information can be processed. In the general case, we cannot represent this
information by several real numbers, because the laws of behaviour of the external
world may be unknown in advance. Note that external information should be
measured, so, in some sense, it is computable. For these reasons we represent
external information by a majorant-computable real-valued function.

2. The plant is given by a real-valued functional. At the moment of interaction,
using the law computed by the discrete device, the plant transforms the external
function to a real value which is the output of the plant. So the theory of
majorant-computable functionals is an adequate mathematical tool for a
formalization of the mentioned phenomena. Although the differential operator is not
used as a basic one, this formalization is compatible with the representation of the
plant by an ordinary differential equation (see [13,20]). Indeed, if there exists some
method for approximate computation of the solution to the differential equation
that is based on difference operators, like the Galerkin method, then such a solution
can be described by a computable functional (see [13,20]).

3. The trajectories of plants are described by computable functionals. So the
trajectories are exactly characterized in logical terms (via $\Sigma$-formulas). Thus,
we prove a proposition which connects the trajectory of a plant with the validity
of two $\Sigma$-formulas in the basic model.



2 Basic Notions 

To construct a formalization of hybrid systems we introduce a basic model and
recall the notions of majorant-computability of real-valued functions and
functionals. To specify complicated systems such as hybrid systems we extend the
real numbers $\mathbb{R}$ by adding the list superstructure $L(\mathbb{R})$, the set of finite
sequences (words), $A^*$, of elements of $A$, where $A$ is a finite alphabet, together
with the predicates $P_{a_i}$ for each element $a_i$ of $A$, and appropriate operations
for working with elements of $L(\mathbb{R})$ and $A^*$.

We consider the many-sorted model $\mathcal{M} = \langle HW(\mathbb{R}), A^* \rangle$ with the following
sorts:

1. $HW(\mathbb{R}) = \langle \mathbb{R}; L(\mathbb{R}), \mathrm{cons}, \in, [\,] \rangle$, where
$\mathbb{R} = \langle \mathbb{R}, 0, 1, +, \cdot, < \rangle$ is the standard model of the reals, denoted also by $\mathbb{R}$;
the set $L(\mathbb{R})$ is constructed by induction:

(a) $L_0(\mathbb{R}) = \mathbb{R}$;

(b) $L_{i+1}(\mathbb{R})$ = the set of finite ordered sequences (lists) of elements of $\mathbb{R} \cup L_i(\mathbb{R})$;

(c) $L(\mathbb{R}) = \bigcup_{i \in \omega} L_i(\mathbb{R})$;

(d) $\sigma_{HW(\mathbb{R})} = \{0, 1, +, \cdot, <\} \cup \{\mathrm{cons}, \in, [\,]\}$, where $\mathrm{cons}$, $\in$, $[\,]$ (empty list) are
defined in the standard way (see [8]).

This structure was first proposed by Backus in [1]; it is now rather well
studied in [2,5,8]. This structure enables us to define the natural numbers,
to code, and to store information via formulas.

2. $A^* = \langle A^*, \sigma_{A^*} \rangle$ is the set of finite sequences (words) of elements of $A$,
where $A = \{a_1, \ldots, a_n\}$ is a finite alphabet. The elements of the language
$\sigma_{A^*} = \{P_{a_1}, \ldots, P_{a_n}, =, \in, \mathrm{conc}, ()\}$ are defined in the standard way (see [23]).

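3. $\sigma_{\mathcal{M}} = \sigma_{HW(\mathbb{R})} \cup \sigma_{A^*} \cup \{*\}$, where $*$ is defined in the following way:

(a) $* : A^* \times HW(\mathbb{R}) \to HW(\mathbb{R})$,

(b) $(a_{i_1}, \ldots, a_{i_k}) * [x_1, \ldots, x_n] = [y_1, \ldots, y_m]$, where $m = \min(k, n)$ and

$$y_j = \begin{cases} x_j & \text{if } a_{i_j} = \ldots, \\ 0 & \text{otherwise.} \end{cases}$$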



The variables of $\sigma_{\mathcal{M}}$ are subject to the following conventions: $a, b, c, d, \ldots$ range over
$\mathbb{R}$; $l_1, l_2, \ldots$ range over $L(\mathbb{R})$; $x, y, z, \ldots$ range over $\mathbb{R} \cup L(\mathbb{R})$; $a_1, \ldots, a_n$ range
over $A$; $\alpha, \beta, \gamma, w, \ldots$ range over $A^*$. This notation gives us an easy way to assert
that something holds of real numbers, of lists, or of words.

The notions of a term and an atomic formula in the languages $\sigma_{HW(\mathbb{R})}$ and
$\sigma_{A^*}$ are given in a standard manner.

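The set of atomic formulas in $\sigma_{\mathcal{M}}$ is the union of the sets of atomic formulas
in $\sigma_{HW(\mathbb{R})}$, $\sigma_{A^*}$, and the set of formulas of the type $w * l_i = l_j$. The set
of $\Delta_0$-formulas in $\sigma_{\mathcal{M}}$ is the closure of the set of atomic formulas in $\sigma_{\mathcal{M}}$
under $\wedge, \vee, \neg, \exists x \in l, \forall x \in l, \exists a \in w$ and $\forall a \in w$. The set of $\Sigma$-formulas in $\sigma_{\mathcal{M}}$ is
the closure of the set of $\Delta_0$-formulas under $\wedge, \vee, \exists x \in l, \forall x \in l, \exists a \in w, \forall a \in w$,
and $\exists$. We define $\Pi$-formulas as negations of $\Sigma$-formulas.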
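
The closure conditions above amount to a purely syntactic test. The following Python sketch shows one way to classify a formula as $\Delta_0$, $\Sigma$ or $\Pi$; the small formula representation (`Atom`, `Not`, `BinOp`, `Quant`) is a hypothetical encoding chosen for the example, and atomic formulas are left as opaque strings.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Atom:
    text: str                      # atomic formula, kept opaque

@dataclass(frozen=True)
class Not:
    arg: "Formula"

@dataclass(frozen=True)
class BinOp:
    op: str                        # "and" or "or"
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:
    kind: str                      # "exists" or "forall"
    var: str
    bound: Optional[str]           # bounding list or word; None if unbounded
    body: "Formula"

Formula = Union[Atom, Not, BinOp, Quant]

def is_delta0(f: Formula) -> bool:
    """Atomic formulas closed under and, or, not and bounded quantifiers."""
    if isinstance(f, Atom):
        return True
    if isinstance(f, Not):
        return is_delta0(f.arg)
    if isinstance(f, BinOp):
        return is_delta0(f.left) and is_delta0(f.right)
    return f.bound is not None and is_delta0(f.body)

def is_sigma(f: Formula) -> bool:
    """Delta_0 formulas closed under and, or, bounded quantifiers and exists."""
    if is_delta0(f):
        return True
    if isinstance(f, BinOp):
        return is_sigma(f.left) and is_sigma(f.right)
    if isinstance(f, Quant):
        if f.bound is None and f.kind == "forall":
            return False           # an unbounded universal quantifier is not allowed
        return is_sigma(f.body)
    return False                   # negation of a formula that is not Delta_0

def is_pi(f: Formula) -> bool:
    """Pi-formulas are negations of Sigma-formulas."""
    return isinstance(f, Not) and is_sigma(f.arg)

phi = Quant("exists", "y", None, Quant("forall", "x", "l", Atom("x < y")))
```

Here `phi` stands for $\exists y\, \forall x \in l\, (x < y)$, which is a $\Sigma$-formula but not a $\Delta_0$-formula, and `Not(phi)` is a $\Pi$-formula.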

We use definability as one of the basic concepts. Montague [17] proposed
to consider computability from the point of view of definability. Later, many
authors, among them Ershov [5] and Moschovakis [18], paid attention to properties
of this approach applied to various basic models.

Definition 1. 1. A set $B \subseteq HW(\mathbb{R}) \times (A^*)^n$ is $\Sigma$-definable if there exists a
$\Sigma$-formula $\Phi(x)$ such that $x \in B \leftrightarrow \mathcal{M} \models \Phi(x)$. 2. A function $f$ is $\Sigma$-definable if
its graph is $\Sigma$-definable.

In a similar way, we define the notions of $\Pi$-definable functions and sets. The
class of $\Delta$-definable functions (sets) is the intersection of the class of $\Sigma$-definable
functions (sets) and the class of $\Pi$-definable functions (sets). Properties of $\Sigma$-,
$\Pi$-, $\Delta$-definable sets and functions were investigated in [5,8,12]. Note only that
$\Delta$-definable sets are analogues of recursive sets on the natural numbers.

We will use majorant-computable functions and functionals to formalize
information about the external world and plants. Let us recall the notion of
computability for real-valued functions and functionals proposed and investigated in
[12,13]. A real-valued function (functional) is said to be majorant-computable if
we can construct a special kind of nonterminating process computing
approximations closer and closer to the result.

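Definition 2. A function $f : \mathbb{R}^n \to \mathbb{R}$ is called majorant-computable if there
exist an effective sequence of $\Sigma$-formulas $\{\Phi_s(x, y)\}_{s \in \omega}$ and an effective sequence
of $\Pi$-formulas $\{G_s(x, y)\}_{s \in \omega}$ such that the following conditions hold.

1. For all $s \in \omega$, $x \in \mathbb{R}^n$, the formulas $\Phi_s$ and $G_s$ define the same nonempty
interval $\langle \alpha_s, \beta_s \rangle$.

2. For all $x \in \mathbb{R}^n$, the sequence $\{\langle \alpha_s, \beta_s \rangle\}_{s \in \omega}$ decreases monotonically, i.e.,
$\langle \alpha_{s+1}, \beta_{s+1} \rangle \subseteq \langle \alpha_s, \beta_s \rangle$ for $s \in \omega$;

3. For all $x \in \mathrm{dom}(f)$, $f(x) = y \leftrightarrow \bigcap_{s \in \omega} \langle \alpha_s, \beta_s \rangle = \{y\}$ holds.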
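
Conditions 2 and 3 describe a nonterminating process that produces ever smaller nested intervals around the value. The following Python sketch illustrates only this interval behaviour, for the square root function, using exact rational arithmetic and bisection. It does not model the $\Sigma$- and $\Pi$-definability of the intervals, it uses closed intervals for simplicity, and the name `sqrt_intervals` is an assumption made for the example.

```python
from fractions import Fraction
from itertools import islice
from typing import Iterator, Tuple

def sqrt_intervals(x: Fraction) -> Iterator[Tuple[Fraction, Fraction]]:
    """Yield nested rational intervals (lo, hi) with lo <= sqrt(x) <= hi, x >= 0."""
    lo, hi = Fraction(0), max(Fraction(1), x)   # lo*lo <= x <= hi*hi holds initially
    while True:                                 # the process never terminates
        yield lo, hi
        mid = (lo + hi) / 2
        if mid * mid <= x:
            lo = mid                            # sqrt(x) lies in [mid, hi]
        else:
            hi = mid                            # sqrt(x) lies in [lo, mid]

# First 20 approximating intervals for sqrt(2).
steps = list(islice(sqrt_intervals(Fraction(2)), 20))
```

Each interval is contained in the previous one and its width halves at every step, so the intersection of all intervals is the single point $\sqrt{2}$, in the spirit of condition 3.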

For formalization of information about the external world we will use the following
set: $\mathcal{F} = \{f \mid f \text{ is a majorant-computable total real-valued function}\}$.

An important property of a total real-valued function, which will be used
below, is that the function is majorant-computable if and only if its epigraph and
ordinate set are $\Sigma$-definable (i.e. effective sets).

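Definition 3. Let $g_1$ be a Gödel numbering of a set $A_1$ and $g_2$ be a Gödel numbering
of a set $A_2$. A procedure $h : A_1 \to A_2$ is said to be an effective procedure if there
exists a recursive function $f$ such that the following diagram is commutative:

$$\begin{array}{ccc}
\mathbb{N} & \xrightarrow{\;f\;} & \mathbb{N} \\
{\scriptstyle g_1}\downarrow & & \downarrow{\scriptstyle g_2} \\
A_1 & \xrightarrow{\;h\;} & A_2
\end{array}$$

Denote the set of $\Sigma$-formulas by $\boldsymbol{\Sigma}$ and the set of $\Pi$-formulas by $\boldsymbol{\Pi}$.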

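Definition 4. A set $R \subseteq \mathbb{R}^{n+1} \times \mathcal{F}$ is said to be $\Sigma$-definable by an effective
procedure $\varphi : \boldsymbol{\Sigma} \times \boldsymbol{\Sigma} \to \boldsymbol{\Sigma}$ if for each majorant-computable function $f$ and for
$\Sigma$-formulas $A(x, y)$, $B(x, y)$ with the following conditions:

$f(x) = y \leftrightarrow A(x, \cdot) < y < B(x, \cdot)$ and $\{z \mid A(x, z)\} \cup \{z \mid B(x, z)\} = \mathbb{R} \setminus \{y\}$

the following proposition holds: $\mathcal{M} \models R(x, y, f) \leftrightarrow \mathcal{M} \models \varphi(A, B)(x, y)$.

In a similar way, we define the notion of $\Pi$-definability by an effective
procedure $\psi : \boldsymbol{\Sigma} \times \boldsymbol{\Sigma} \to \boldsymbol{\Pi}$.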

Definition 5. A functional $F : \mathbb{R}^n \times \mathcal{F} \to \mathbb{R}$ is called majorant-computable
if there exists an effective sequence of sets $\{R_s\}_{s \in \omega}$, where each element $R_s$ is
$\Sigma$-definable by an effective procedure $\varphi_s$ and $\Pi$-definable by an effective procedure
$\psi_s$, such that the following properties hold:

1. For all $s \in \omega$, the set $R_s(x, \cdot, f)$ is a nonempty interval;

2. For all $x \in \mathbb{R}^n$ and $f \in \mathcal{F}$, the sequence $\{R_s(x, \cdot, f)\}_{s \in \omega}$ decreases
monotonically;

3. For all $(x, f) \in \mathrm{dom}(F)$, $F(x, f) = y \leftrightarrow \bigcap_{s \in \omega} R_s(x, \cdot, f) = \{y\}$ holds.

3 Specifications of Hybrid Systems 

Let us consider hybrid systems of the type considered in the Introduction. A
specification of the hybrid system $SHS = \langle TS, F, Conv1, A, Conv2, I \rangle$ consists of:

• $TS = \{t_i\}_{i \in \omega}$. It is an effective sequence of real numbers. The real numbers $t_i$
are the times of communication of the external world and the hybrid system,
and the plant and the control automata. The time sequence $\{t_i\}_{i \in \omega}$ satisfies
the realizability requirements:

1. For every $i$, $t_i > 0$;

2. $t_0 < t_1 < \ldots < t_i < \ldots$;

3. The differences $t_{i+1} - t_i$ have positive lower bounds.

• $F : HW(\mathbb{R}) \times \mathcal{F} \to \mathbb{R}$. It is a majorant-computable functional. The
behaviour of the plant is modelled by this functional.

• $Conv1 : \mathbb{N} \times \boldsymbol{\Sigma} \times \boldsymbol{\Sigma} \to A^*$. It is an effective procedure. At the time of
communication this procedure converts the number of the time interval and the
measurements, presented by two $\Sigma$-formulas, into finite words which are input
words of the internal control automaton.

• $A : A^* \to A^*$. It is a $\Sigma$-definable function. The internal control automaton,
in practice, is a finite state automaton with finite input and finite output
alphabets. So, it is naturally modelled by a $\Sigma$-definable function (see [5,8,12])
which has a symbolic representation of measurements as input and produces
a symbolic representation of the next control law as output.

• $Conv2 : A^* \to HW(\mathbb{R})$. It is a $\Sigma$-definable function. This function converts
finite words representing control laws into control laws imposed on the plant.

• $I \subseteq A^* \cup HW(\mathbb{R})$. It is a finite set of initial conditions.
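
The components of such a specification can be pictured as a record, and the realizability requirements can at least be tested on a finite prefix of the time sequence. The following Python sketch is an illustration under simplifying assumptions: the type signatures are rough stand-ins for the components listed above, the names `HybridSpec` and `realizable_prefix` are hypothetical, and a check on a finite prefix can refute but never establish requirement 3 for the whole sequence.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class HybridSpec:
    """Record mirroring SHS = <TS, F, Conv1, A, Conv2, I> with simplified types."""
    ts: Callable[[int], float]                                         # TS: i -> t_i
    plant: Callable[[float, float, Callable[[float], float]], float]  # F(t, z, f)
    conv1: Callable[..., str]           # interval number and measurements -> word
    automaton: Callable[[str], str]     # A : A* -> A*
    conv2: Callable[[str], float]       # word -> control law
    initial: frozenset                  # I, a finite set of initial conditions

def realizable_prefix(ts: Callable[[int], float], k: int, min_gap: float) -> bool:
    """Check requirements 1-3 on t_0, ..., t_k for a claimed lower bound min_gap."""
    times = [ts(i) for i in range(k + 1)]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return (
        min_gap > 0
        and all(t > 0 for t in times)          # requirement 1
        and all(g > 0 for g in gaps)           # requirement 2: strictly increasing
        and all(g >= min_gap for g in gaps)    # requirement 3: gaps bounded below
    )

uniform_ok = realizable_prefix(lambda i: 1.0 + 0.5 * i, 10, 0.5)
shrinking_ok = realizable_prefix(lambda i: 2.0 - 2.0 ** (-i), 10, 0.1)
```

The second sequence is strictly increasing but its gaps shrink towards zero, so it fails requirement 3: infinitely many interventions would accumulate before time 2.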



Theorem 1. Suppose a hybrid system is specified as above. Then the trajectory 
of the hybrid system is defined by a majorant-computable functional. 

Proof. Let $SHS = \langle TS, F, Conv1, A, Conv2, I \rangle$ be a specification of the hybrid
system. We consider the behaviour of the hybrid system in terms of our specification
on $[t_i, t_{i+1}]$. Let $F(t_i, z_i, f) = y_i$, where $z_i$ represents the recent control law, and
$y_i$ is the state of the plant at the time $t_i$.

At the moment $t_i$ Converter 1 gets measurements of recent states of the plant
as input. By the properties of majorant-computable functionals, these measurements
can be presented by two $\Sigma$-formulas which code methods of computation of
measurements. These representations are compatible with real measurements.
Indeed, using different approaches to process some external signals from the
plant, Converter 1 may transform them to different results. This remark is taken
into account in our formalization of Converter 1. Thus, $Conv1$ is a $\Sigma$-definable
function and its arguments are the methods of computation of measurements.
The value of the function $Conv1$ is an input word $w_1$ of the digital automaton
which is represented by $A$. From $w_1$ the function $A$ computes a new control law $w_2$
and $Conv2$ transforms it to $z$.

The plant transforms new information about the external world presented by $f$
to recent states of the plant according to the control law $z$, i.e., $y = F(t, z, f)$
for $t \in [t_i, t_{i+1}]$. The theorem states that there exists a majorant-computable
functional $F$ such that $y(t) = F(t, f)$.

By definition, $F(t, z, f)$ is a majorant-computable functional. Denote the
initial time by $t_0$ and the initial position of the plant by $y_0$. Let $f$ be a
majorant-computable function, and $O$ be its ordinate set, $E$ be its epigraph. By the
properties of majorant-computable functionals (see [13,14]) there exist two effective
procedures $h_1, h_2$ such that

$F(x, f) = y \leftrightarrow h_1(O, E)(x, \cdot) < y < h_2(O, E)(x, \cdot)$ and
$\{z \mid h_1(O, E)(x, z)\} \cup \{z \mid h_2(O, E)(x, z)\} = \mathbb{R} \setminus \{y\}$

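Denote $\Phi_0 \rightleftharpoons (y > y_0)$ and $\Psi_0 \rightleftharpoons (y < y_0)$. For $t \in [t_0, t_1]$ put:

$$\varphi_1(O, E)(t, y) \rightleftharpoons \exists w_1 \exists w_2 \exists a \, [Conv1(1, \Phi_0, \Psi_0) = w_1 \wedge A(w_1) = w_2 \wedge Conv2(w_2, a) \wedge h_1(O, E)(t, a, y)],$$

$$\varphi_2(O, E)(t, y) \rightleftharpoons \exists w_1 \exists w_2 \exists a \, [Conv1(1, \Phi_0, \Psi_0) = w_1 \wedge A(w_1) = w_2 \wedge Conv2(w_2, a) \wedge h_2(O, E)(t, a, y)].$$

In the same way we can construct the procedures $\varphi_1, \varphi_2$ for each interval $[0, t_{i+1}]$.
Put

$F(t, f) = y \leftrightarrow \varphi_1(O, E)(t, \cdot) < y < \varphi_2(O, E)(t, \cdot)$ and
$\{z \mid \varphi_1(O, E)(t, z)\} \cup \{z \mid \varphi_2(O, E)(t, z)\} = \mathbb{R} \setminus \{y\}$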

By construction, the functional $F$ is majorant-computable and defines the
trajectory of the hybrid system with SHS specification. □

This paper has presented the description of trajectories in terms of majorant- 
computable functionals which can be constructed by the specifications SHS of 
hybrid systems. The preliminary results suggest possible directions for future 
applications to study real hybrid systems. 

Abstract. Algebraic imperative specifications (AIS) are specifications
with implicit state represented by an algebra and with a number of tran- 
sition rules indicating state transformations. They are designed for the 
formal definition of complex dynamic systems. 

Two approaches to algebraic imperative specifications have been devel- 
oped in parallel during the last decade: Abstract State Machines (ASMs), 
initially known as evolving algebras, and Algebraic Specifications with 
Implicit State (AS-IS). Moreover, typed versions of ASM have been
developed which have incorporated some aspects of AS-IS.

This survey paper provides a guided tour of these imperative approaches 
of specification based on the state-as-algebra paradigm, and sketches 
a synthesis of two of them, under the name of dynamic systems with 
implicit state. 



1 Introduction 

Algebraic imperative specifications (AIS) are specifications with implicit state 
represented by an algebra and with a number of transition rules indicating state 
transformations. They are designed for the formal definition of complex dynamic 
systems. 

It is a fact that a complex system to be implemented in some programming 
language usually possesses static and dynamic features. The static features are 
represented by a number of data types involved and a number of functions de- 
fined over them. The dynamic features are represented by a number of states 
the system can be in and a number of operations (procedures, modifiers) trans- 
forming the states. 

Conventional algebraic specifications [12,13,38] have proved to be an elegant 
and effective way of defining the static aspects of such a system. Using this 
technique, one can define a number of data types (sets with corresponding oper- 
ations) and functions just by providing a signature (i.e., the names of sorts, and 
the names of operations accompanied by their profiles) and a set of axioms lim- 
iting the set of possible models. These data types and operations can be further 
used in the system specification. 

However, algebraic specifications are less convenient in defining the dynamic
aspects of a system. In this case, the state has to be defined in some way (for 
example, as a complex data type) and its instances have to be explicitly used as 
arguments and/or results in operations transforming one state into another. As 
a result, the specification becomes very clumsy: it is difficult both to write and 
read. 

In parallel with algebraic specifications, a number of methods involving the 
notion of built-in state have been suggested which avoid the above-mentioned 
problem of the explicit state. The most well-known of them are VDM [31] and 
Z [36,37]. (See [35] for a good review.) One of the latest developments in the 
field is B [1]. The main idea of each of these methods is that all the opera- 
tions transforming the state can be characterized by observing their effect on 
a number of variables (variables are understood here in the same way they are 
understood in programming languages) representing components of the system’s 
state. Therefore, the variable value before the operation and after its execution 
is taken into account and a relation between these two values is specified. It is 
done by a logical formula relating pre-operation and post-operation values of 
one or more variables in Z, by giving two formulas specifying the condition to be 
satisfied by the variables before the operation (pre-condition) and the condition 
to be satisfied by them after the operation execution (post-condition) in VDM, 
and by substitution rules in B. For this purpose, special decoration is normally 
proposed for indicating variable values before the operation and after it (hooks 
for pre-operation values in VDM and primes for post-operation values in Z). 

A common feature of the three methods is their use of a fixed number of basic 
types and type constructors for the representation of application data. The usual 
basic types are integers (with their subsets) and scalars given by enumerations. 
The usual type constructors are set constructor, tuple constructor and several 
kinds of function constructors. VDM restricts the set of function constructors 
to finite maps (i.e., partial functions with a finite domain) and offers a sequence 
constructor in addition. Z allows the definitions of binary relations in addition, 
and B does not possess a tuple constructor. 

Another common feature of these methods is that some parts of the semantics 
of some basic notions remain informal. For example, the formal definition of “a 
simple and powerful specification language closely similar to the Z notation” in 
[36] does not explain the notion of state intensively used in its informal semantics. 
There, a not-producing-result operation is said to transform the state while 
its formal specification just sets some relations among primed and non-primed 
names in a model of the signature induced by the operation specification. In 
VDM and B such notions as state, variable, and operation are also introduced 
informally: it is assumed that they are well understood by those who write 
specifications and those who read them. 

However, if we say “constant” instead of “variable” , we can regard the state as 
an algebra with a number of defined constants and functions, and we can regard 
primed and non-primed (or hooked and non-hooked) names as denotations of 
the same constant name in two different algebras. In this case, we can say that 
a formula relates values associated with a given constant name in two algebras,
and an operation updating the state can be defined as an algebra transformation. 
Moreover, if we specify the state as an algebra, we can delete the limitations on 
the sets of data types involved. In the specification of a particular application, 
those data types are defined which are practically needed in the application. All 
the power of the algebraic specifications can be used in this case. 

The introduction of the notion of algebra update as a transition from one 
state to another naturally leads us to a form of specification which explicitly
indicates in which way a constant (a function in the general case) is updated
in the process of algebra transformation. No decoration of names is needed in
this case. By analogy with imperative languages, we call this kind of specifications
algebraic imperative specifications (AIS). The word algebraic emphasizes
the algebraic nature of the state; the word imperative suggests an analogy with 
imperative languages. 

AIS may be used for describing algorithms: every step of an algorithm can 
be regarded as a transition from one state to another simulated at the most 
appropriate abstraction level. Imperative specifications may also be used for
describing, in an abstract and non-algorithmic way, dynamic features of a system:
each state transforming operation is described in terms of some complex algebra 
updates. 

Finally, it is generally accepted that the ease (or difficulty) of the implemen- 
tation of a specification heavily depends on its structure and complexity. Since 
the majority of the programs are written in imperative languages, there is much 
more chance that a specification will be read and implemented by a programmer 
if it is imperative. This feature relates AIS to some other specification languages 
which could also be called imperative but not algebraic [1,6]. 

Two approaches to algebraic imperative specifications have been developed in 
parallel during the last decade: Abstract State Machines (ASMs), initially known 
as evolving algebras [28,29], and Algebraic Specifications with Implicit State (AS- 
IS) [8,33]. The main features of AS-IS are presented in the next section. Basic 
notions of ASM and its typed versions are described in Section 3. Dynamic 
systems with implicit state combining some features of both approaches are
presented in Section 4. Some related work, all based on the state-as-algebra idea, 
is discussed in Section 5 and some conclusions are given in Section 6. 



2 Algebraic Specifications with Implicit State 

The origins of this approach go back to the 1980’s, to some work on compiler con- 
struction from some formal semantics of the source and target languages [14,15]. 
There, the semantics of imperative languages was modeled by state transformations,
where the states were many-sorted algebras. In the area of programming
language semantics, other approaches generally model states as functions, which, 
roughly, go from some kinds of names into some kinds of values, the domain and 
co-domain of these state functions being unions of sets. Such approaches become 
clumsy when values of complex data types have to be stored and modified: some 
operations on names must mimic the operations on the data types (such as
accesses to components and constructors) and adequate commutativity properties
must be maintained when modifying the state. In [14], it was shown how to use
many-sorted algebras as models of such states, based on the classical idea that
data types are algebras. Some extensions were invented to take into account the 
notion of variables, assignments being modeled as transformations of algebras. 
The advantage of such a framework for compiler specification is that the rep- 
resentation of the source data types by some target data types can be proved 
using the techniques developed for algebraic specifications [15]. 

Some years later, this first approach served as the inspiration for AS-IS, 
Algebraic Specifications with Implicit State. The motivation for the design of 
AS-IS was a case study on the formal specification of the embedded safety part 
of an automatic subway pilot [9,10]. The specified system was a classical control- 
command loop, where the body of the loop receives some inputs, performs some 
computations, and returns some outputs. Inputs come from some sensors or some 
ground controller. Outputs are alarms, commands, or messages to the ground 
controller. The first formal specification was written in a pure algebraic style, 
using the PLUSS specification language [16]. It turned out that the state of the 
system was characterized by 54 values of various types (abscissa, speed, next 
train, tables, . . .). Most of these values were liable to be updated during some 
cycles of the loop. As a consequence, the specification contained 54 observer 
operations of the state, i.e. operations of profile state × ... → s, where s is
a sort different from state, and 54 update operations, i.e. operations of profile
state × ... × s → state. A long and uninteresting axiomatization of these
108 operations was needed. In order to shorten the specification, a predefined 
notion of record, similar to the one in VDM, was introduced in the specification 
language. However, it was still boring and redundant to have states as parameters 
everywhere. This has led to the introduction of a concept of implicit state in the 
algebraic specification language. Of course, such a notion must not be limited to 
the special case of a record. Actually, it must be possible to specify any kind of 
data structure, at different abstraction levels, and any evolution concerning the 
implicit state. 
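
The burden of the explicit state can be seen in a small sketch. The Python code below imitates the pure algebraic style with an immutable record: every observer takes the state as an argument and every update takes a state and returns a new one. Only three of the 54 components are shown; the field names follow the examples in the text, while the concrete values and function names are assumptions made for the illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class State:
    abscissa: float
    speed: float
    next_train: str

# Observers: profile  state -> s
def speed(st: State) -> float:
    return st.speed

def abscissa(st: State) -> float:
    return st.abscissa

# Updates: profile  state x s -> state
def set_speed(st: State, v: float) -> State:
    return replace(st, speed=v)

def set_abscissa(st: State, x: float) -> State:
    return replace(st, abscissa=x)

s0 = State(abscissa=0.0, speed=12.0, next_train="T2")
s1 = set_speed(s0, 8.0)   # the state is threaded explicitly through every call
```

The axiomatization mentioned above corresponds to facts such as `speed(set_speed(s, v)) == v` and `abscissa(set_speed(s, v)) == abscissa(s)`, which have to be stated for every pair of observer and update.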

Another, more complex, case study was then performed [17], namely the 
Steam-Boiler Control Problem. It has led to some additions to the formalism, in
order to avoid overly algorithmic specifications of complex evolutions of the system.
The most recent version of AS-IS is presented in [32].

An AS-IS specification is based on a classical algebraic specification which
describes the data types to be used by the system. This part is clearly isolated
in the specification and its meaning is stable, whatever modification of the state
is specified (this implies that the carriers remain invariant). The evolving parts
of the implicit state are specified as access functions whose results depend on
the implicit state.

Example. In a subway example, there may be the following access functions,
which correspond to the section of the railway where the train is currently
located, and a table where the speed limit for each section is stored:

CurrentSection : → Section
LocalSpeedLimit : Section → Speed

where the Section and Speed types are specified in the data type part.

The evolutions of the implicit state are described by modifications of the 
access functions. 

Example. When the train progresses, one may have

CurrentSection := next(CurrentSection)

or when the weather conditions change

∀s : Section, LocalSpeedLimit(s) := LocalSpeedLimit(s) − 10

Let $\Sigma$ be the signature of the data types, $Ax$ their axioms, and $\Sigma_{ac}$ the part
of the signature corresponding to the names and profiles of the access functions.
A state is any $\langle \Sigma \cup \Sigma_{ac}, Ax \rangle$-algebra. A modification is a mapping from
the $\langle \Sigma \cup \Sigma_{ac}, Ax \rangle$-algebras into themselves where the interpretations of some
access functions of the resulting algebra are different from their interpretations
in the source one. The example modifications above are called elementary since
each of them involves one access function only.

In addition to the elementary accesses, such as the ones above, which charac- 
terize the implicit state, there are dependent accesses which are related by some 
property to the other accesses. 

Example. One may define

CurrentSpeedLimit : → Speed

CurrentSpeedLimit = min(LocalSpeedLimit(CurrentSection), ...)

Among the design choices of AS-IS, it was decided to keep the specified 
behaviors deterministic. In order to ensure this, the dependent accesses must 
be defined by a set of axioms which is sufficiently complete with respect to 
the elementary accesses and data types. Thus an AS-IS specification includes,
in addition to the specification of some data types with signature $\Sigma$ satisfying
some axioms $Ax$, some elementary access functions whose names and profiles
are given in a sub-signature $\Sigma_{eac}$, some dependent access functions specified by
a sub-signature $\Sigma_{dac}$ and some axioms $Ax_{ac}$. Let $\Sigma' = \Sigma \cup \Sigma_{eac} \cup \Sigma_{dac}$. Then a
state is any $\langle \Sigma', Ax \cup Ax_{ac} \rangle$-algebra.

The semantics of elementary modifications is based on restrictions and ex- 
tensions of the state algebras. First, all the dependent accesses are forgotten. 
Then, if ac is the name of an elementary access being modified, the algebra is 
extended by the new elementary access ac' , with the same profile as ac, which 
is different from ac for the values of the arguments specified in the modification 
(see below) and the same everywhere else. Then ac is forgotten, ac' is renamed 
ac, and the algebra is extended to include the dependent accesses and satisfy 
the corresponding axioms. 
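
The forget, extend and rename steps just described can be mimicked with dictionaries. In the Python sketch below a state maps access names to their interpretations, and each dependent access is given by a function that computes it from the elementary accesses. This is a simplified illustration, not the AS-IS semantics itself: the names `extend` and `modify` are hypothetical, dependent accesses are assumed to depend on elementary accesses only, and the axiom for CurrentSpeedLimit drops the elided arguments of min in the example above.

```python
from typing import Callable, Dict

Access = Dict[str, object]                        # access name -> interpretation
Axioms = Dict[str, Callable[[Access], object]]    # dependent access -> its definition

def extend(elementary: Access, axioms: Axioms) -> Access:
    """Extend an algebra of elementary accesses with the dependent accesses."""
    state = dict(elementary)
    for name, define in axioms.items():
        state[name] = define(elementary)
    return state

def modify(state: Access, axioms: Axioms, ac: str, new_interpretation: object) -> Access:
    """Elementary modification of the access named ac."""
    # First, all the dependent accesses are forgotten.
    elementary = {k: v for k, v in state.items() if k not in axioms}
    # Introduce ac', forget ac, rename ac' to ac: here, rebind the name.
    elementary[ac] = new_interpretation
    # Extend again so that the dependent accesses satisfy their axioms.
    return extend(elementary, axioms)

axioms: Axioms = {
    "CurrentSpeedLimit": lambda e: e["LocalSpeedLimit"][e["CurrentSection"]],
}
a = extend({"CurrentSection": "s1", "LocalSpeedLimit": {"s1": 60, "s2": 40}}, axioms)
b = modify(a, axioms, "CurrentSection", "s2")
```

After the modification the dependent access is recomputed, so `b["CurrentSpeedLimit"]` is 40 while the source state `a` still gives 60.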

In an AS-IS specification, as soon as an elementary access $ac : s_1 \times \ldots \times s_n \to s$
is declared, it is possible to write elementary modifiers of the form

$\forall x_1 : s'_1, \ldots, x_p : s'_p, \; [ac(\pi_1, \ldots, \pi_n) := R(\pi_1, \ldots, \pi_n)]$

where the $\pi_i$ are terms of $T_{\Sigma'}(\{x_1, \ldots, x_p\})$, of sort $s_i$, which play a role similar
to patterns in functional programming, and $R(\pi_1, \ldots, \pi_n)$ is a term built with
the constants of $\Sigma'$, the $\pi_i$, and the operations of $\Sigma'$. Such a modifier induces
the modification of the result of $ac$ for all the values matching the patterns, i.e.,
if $A$ is the original state and $B$ the modified one, we have:

$\forall v_1, \ldots, v_n$ in $A_{s_1} \times \ldots \times A_{s_n}$

- if there exists an assignment $\alpha$ of the $x_i$ into $A_{s'_i}$, such that
$\alpha(\pi_1) = v_1, \ldots, \alpha(\pi_n) = v_n$, then $ac^B(v_1, \ldots, v_n) = R(\alpha(\pi_1), \ldots, \alpha(\pi_n))$

- otherwise
$ac^B(v_1, \ldots, v_n) = ac^A(v_1, \ldots, v_n)$.

In the above example a quantified elementary modifier is used to specify a 
global change of the local speed limits. 
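
The effect of a quantified elementary modifier can be illustrated by a minimal Python sketch. It assumes a finite state in which an elementary access is a dictionary from argument tuples to values; the function `apply_elementary_modifier`, its parameters and the sample data are hypothetical and not part of AS-IS. The sketch does not check the restrictions that guarantee a single result per point.

```python
from itertools import product

def apply_elementary_modifier(access, carriers, patterns, result):
    """Return the modified access ac^B computed from the original access ac^A.

    access   : dict mapping argument tuples to values (ac^A)
    carriers : list of finite iterables, one per quantified variable x_i
    patterns : function mapping an assignment (a tuple) to an argument tuple
    result   : function computing the new value from ac^A and the matched point
    """
    new_access = dict(access)            # points matching no pattern keep their value
    for assignment in product(*carriers):
        point = patterns(assignment)     # (alpha(pi_1), ..., alpha(pi_n))
        if point in access:
            # the right-hand side is evaluated in the original state A
            new_access[point] = result(access, point)
    return new_access

# Hypothetical data: three sections with their local speed limits.
local_speed_limit = {("s1",): 120, ("s2",): 90, ("s3",): 60}

# forall s : Section, [LocalSpeedLimit(s) := LocalSpeedLimit(s) - 10]
lowered = apply_elementary_modifier(
    local_speed_limit,
    carriers=[["s1", "s2", "s3"]],
    patterns=lambda a: (a[0],),
    result=lambda ac, point: ac[point] - 10,
)
```

The original dictionary plays the role of state A and is left untouched; `lowered` plays the role of state B.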

There is a conditional version of such modifiers, with the same restriction on
the form of the conditions as on the result: they must involve the $\pi_i$ only.

$\forall y_1, \dots, y_p$ cases
    $\phi_1$ then $ac(\pi_1^1, \dots, \pi_n^1) := R^1$ | $\dots$ | $\phi_m$ then $ac(\pi_1^m, \dots, \pi_n^m) := R^m$
end cases

The restrictions on the form of the conditions and results ensure that only 
one result is specified for each item of the domain of the elementary access being 
modified. Counter-examples justifying these restrictions are given in [19]. 

Elementary accesses can be used to specify defined modifiers. Defined modi- 
fiers are specified by compositions of elementary modifiers and defined modifiers. 
The compositions are 

- Conditional composition of the following form:

  begin $\phi_1$ then $Em_1$ | $\dots$ | $\phi_p$ then $Em_p$ end

  indicating that a modification expression $Em_i$ is chosen if its condition $\phi_i$
  is valid. If several conditions $\phi_i$ are valid, the modification expression with
  the smallest index is chosen.

  Note: This form of modification is different from the conditional elementary
  modifier in two ways: the $Em_i$ are any modification expressions and there
  are no universally quantified variables.

- Sequential composition, $m_1 ; m_2$, meaning that the execution of $m_1$ is followed
  by that of $m_2$.

- Causally independent composition, $m_1$ and $m_2$, indicating any sequential
  composition of $m_1$ and $m_2$. The order of execution of $m_1$ and $m_2$ is
  unimportant.

- Simultaneous composition, $m_1 \bullet m_2$, where the modifications specified by $m_1$
  and $m_2$ are applied to the same state. If $m_1$ and $m_2$ specify a modification of
  the same access function, they must change it at different points; otherwise,
  only the modification $m_1$ is taken into account.

This list does not aim at being minimal. Actually, some constructs overlap 
in some cases. It aims to provide a convenient way of specifying complex state 
modifications, without worrying about details such as intermediate results or 
order of execution when they are not relevant to the specification. 
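
The difference between these compositions can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: a state is a flat dictionary, a modification is a function returning the updates it requests, and a conditional composition in which no condition holds is assumed to leave the state unchanged. All names (`seq`, `sim`, `cond`, `apply_updates`) are hypothetical.

```python
def apply_updates(state, updates):
    """Return a new state in which the requested updates have been performed."""
    new_state = dict(state)
    new_state.update(updates)
    return new_state

def seq(m1, m2):
    """m1 ; m2 : m2 is evaluated in the state produced by m1."""
    def m(state):
        u1 = m1(state)
        u2 = m2(apply_updates(state, u1))
        return {**u1, **u2}
    return m

def sim(m1, m2):
    """m1 . m2 (simultaneous): both are evaluated in the same state;
    if they touch the same point, only m1 is taken into account."""
    def m(state):
        return {**m2(state), **m1(state)}
    return m

def cond(*branches):
    """begin phi_1 then Em_1 | ... | phi_p then Em_p end:
    the valid branch with the smallest index is chosen."""
    def m(state):
        for phi, em in branches:
            if phi(state):
                return em(state)
        return {}   # assumption: no valid condition means no change
    return m

set_x_to_y = lambda st: {"x": st["y"]}
set_y_to_x = lambda st: {"y": st["x"]}
start = {"x": 1, "y": 2}

swapped = apply_updates(start, sim(set_x_to_y, set_y_to_x)(start))
copied = apply_updates(start, seq(set_x_to_y, set_y_to_x)(start))
```

With the simultaneous composition the two values are exchanged, since both right-hand sides are read in the same state; with the sequential composition the second assignment already sees the new value of `x`.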

Thus defined modifiers are declared with a profile which states the sorts of
their arguments, and their effect on the state is described by a modification
expression.

Example.

    switchSpeedLimits : Speed

    switchSpeedLimits(Δs) = ∀s : Section,
        [LocalSpeedLimit(s) := LocalSpeedLimit(s) − Δs]

Defined modifiers and access functions may be exported by a system speci- 
fication. When using a system specification, only the exported features can be 
mentioned. This ensures some encapsulation of the implicit state. 

An AS-IS specification also contains a set of axioms Axinit which specifies 
possible initial states of the specified system. The behaviors of the system are 
sequences of exported instantiated modifiers, i. e. exported defined modifiers 
with ground terms as arguments, or elementary modifiers of exported accesses 
with parameters either quantified or instantiated by ground terms. A reachable 
state of the system is either an initial state, or the resulting state of an exported 
instantiated modifier applied to a reachable state. 

An example of a system specification is given below. It is a drastic (and thus 
unrealistic) simplification of the specification presented in [10]. 

The specified system can progress, with a measured speed, during an interval
of time Δt, or the speed limits of the sections can be changed via the
switchSpeedLimits modification, or an emergency stop can occur.

The progress modification is the most complex one. It checks that the speed 
limit is respected. If it is not, an emergency stop occurs, and if it is, the system 
deals with a possible section change, chooses an acceleration which depends on 
the current speed (this choice is not specified here), and computes the next 
position of the train. 

system TRAIN export progress, emergencyStop, switchSpeedLimits
  use UNITS,     % defines the sorts Abscissa, Speed, and Acceleration
                 % and some constants of these sorts
      SECTION    % defines the Section sort
  elementary accesses
    CurrentSection : → Section,
    LocalSpeedLimit : Section → Speed,
    MeasuredSpeed : → Speed,
    CurrentAbscissa : → Abscissa,
    CurrentAcceleration : → Acceleration
  accesses
    CurrentSpeedLimit : → Speed,
  accesses axioms
    CurrentSpeedLimit = min(LocalSpeedLimit(CurrentSection), ...)
  Init
    CurrentSection = section0, LocalSpeedLimit(s) = speedlim0,
    MeasuredSpeed = speed0, CurrentAbscissa = 0,
    CurrentAcceleration = acc0,
  modifiers      % declaration of some defined modifiers
    progress : Speed,
    emergencyStop,
    switchSpeedLimits : Speed,
    sectionChange,
    accelerationChoice,
  modifiers definitions
    progress(s) =
      MeasuredSpeed := s and CurrentAbscissa := NextAbscissa ;
      begin
        CurrentAbscissa > length(CurrentSection) then sectionChange
      end ;
      begin
        MeasuredSpeed > CurrentSpeedLimit then emergencyStop |
        MeasuredSpeed < CurrentSpeedLimit then
          accelerationChoice ;
          NextAbscissa := CurrentAbscissa +
            (CurrentSpeed + CurrentAcceleration × Δt) × Δt, ...
      end
    switchSpeedLimits(Δs) = ∀s : Section,
      [LocalSpeedLimit(s) := LocalSpeedLimit(s) − Δs]
    sectionChange =
      CurrentSection := next(CurrentSection) •
      CurrentAbscissa := CurrentAbscissa − length(CurrentSection)
      % NB : it is much more complex in reality ...
    accelerationChoice = ...
    emergencyStop = ...
end system



3 Abstract State Machines 

3.1 Gurevich Abstract State Machines 

Abstract State Machines (ASMs), originally known as evolving algebras, were
proposed by Gurevich [25] as a framework for the formal definition of
the operational semantics of programming languages. During the last decade
many real-life programming languages and many complex algorithms, including
communication protocols and hardware designs, have been defined as ASMs. The
first complete description of the evolving algebra approach is contained in [28],
an annotated bibliography of the majority of papers in the field can be found in
[5], and the most recent developments are available at
http://www.eecs.umich.edu/gasm/.

The success of the approach can be attributed to two reasons: (1) sound 
mathematical background and (2) imperative specification style. The imperative 
nature of evolving algebras has led to the introduction of a new term for them, 
Abstract State Machines (the terms Gurevich Abstract State Machines or just 
Gurevich Machines are also in use). The latest version of ASM is described in
[29] which is used as the main reference source in this section.

ASMs are based on the notion of a universal algebraic structure consisting of
a set, a number of functions, and a number of relations. Such a structure serves 
for the representation of the state. The underlying set is called a super-universe 
and can be subdivided into universes by means of unary relations. A universe 
serves to model a data type (in fact, the set of data type values). 

There are a number of transition rules indicating in which way a state can
be converted into another state of the same signature. Normally, this is done by
a slight change of a function. For this reason, functions can be either static or
dynamic. A static function never changes, whereas a change of a dynamic function
produces a new state. Another means of state modification is changing the number
of elements in the underlying set (importing new elements).

Only total functions are used in Gurevich ASMs. A distinguished
super-universe element undef is used to convert a partial function into a total one.
Thus, every r-ary function $f$ is defined on every r-tuple $\bar{a}$ of elements of the
super-universe, but $f$ is said to be undefined at $\bar{a}$ if $f(\bar{a}) = undef$; the set
of tuples $\bar{a}$ with $f(\bar{a}) \neq undef$ is called the domain of $f$.

The other two distinguished super-universe elements are true and false. The
interpretation of an r-ary predicate (relation name) $U$, defined on the whole
super-universe, with values in {true, false}, is viewed as the set of r-tuples $\bar{a}$ such
that $U(\bar{a}) = true$. If relation $U$ is unary, it can be viewed as a universe.

The vocabulary (signature) of any ASM contains the names of the above three 
distinguished elements, the name of the universe Boole defined as {true, false}, 
the names of the usual Boolean operations interpreted conventionally, and the 
equality sign interpreted as the identity relation on the super-universe. All the 
functions corresponding to the above names are static. 

Example. The vocabulary for oriented trees contains a unary predicate
Nodes and unary function names Parent, FirstChild, and NextSibling. An
oriented tree with n nodes gives rise to a state with n + 3 elements: in addition to
the n nodes, the super-universe contains the obligatory elements true, false, and undef.
The universe Nodes contains the n nodes.

For the interpretation of transition rules, the notions of location, update, and
update set are introduced. A location in a state $A$ is a pair $l = (f, \bar{a})$, where $f$ is
a function name of arity r and $\bar{a}$ is an r-tuple of elements of $A$. In the case that
$f$ is nullary, $(f, ())$ is abbreviated to $f$.

Example. Assume that we have an oriented tree and let a be a node, then
some locations are (Parent, a), (FirstChild, a), (NextSibling, a).

An update in a state $A$ is a pair $\alpha = (l, b)$, where $l = (f, \bar{a})$ is a location in
$A$ and $b$ is an element of $A$. To update the state $A$ using $\alpha$ (“to fire $\alpha$ at $A$”), it
is necessary to “put $b$ into the location $l$”, i.e. convert $A$ into a new algebra $B$
so that $f^B(\bar{a}) = b$. The other locations remain intact.

Example. Assume again that we have an oriented tree and let a, b be any
two nodes, then some updates are ((Parent, a), b), ((FirstChild, b), a).

An update set over a state $A$ is a set of updates of $A$. An update set $\gamma$ is
consistent if no two updates in $\gamma$ clash, i.e. there are no two updates $(l_1, b_1)$ and
$(l_2, b_2)$ such that $l_1 = l_2$ but $b_1 \neq b_2$. To update the state $A$ using a consistent
$\gamma$, it is necessary to “fire all its updates simultaneously”. The state does not change if
the update set is inconsistent.
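
The firing of an update set can be sketched in a few lines of Python. The representation is an assumption made for illustration only: a state is a dictionary from locations `(function_name, argument_tuple)` to values, and the string `"undef"` stands for the element undef; `is_consistent` and `fire` are hypothetical names.

```python
def is_consistent(update_set):
    """True if no two updates assign different values to the same location."""
    seen = {}
    for location, value in update_set:
        if location in seen and seen[location] != value:
            return False
        seen[location] = value
    return True

def fire(state, update_set):
    """Fire all updates simultaneously; an inconsistent set changes nothing."""
    if not is_consistent(update_set):
        return state
    new_state = dict(state)
    for location, value in update_set:
        new_state[location] = value
    return new_state

# A tiny oriented tree with nodes a, b, c and no parent links yet.
tree = {("Parent", ("a",)): "undef", ("Parent", ("b",)): "undef"}

linked = fire(tree, {(("Parent", ("a",)), "b")})
clashing = fire(tree, {(("Parent", ("a",)), "b"), (("Parent", ("a",)), "c")})
```

Here `linked` records b as the parent of a, while `clashing` equals the original state because the two updates clash at the location (Parent, a).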

The main transition rule (or simply “rule” in the sequel), called the update rule,
has the following form:

    $f(\bar{s}) := t$,

where $f$ is the name of a function of arity r, $\bar{s}$ is a tuple of terms,
and $t$ is a term. The interpretation of this rule in a state $A$ causes an update
$\alpha = ((f, \bar{a}), t^A)$, where $\bar{a} = (s_1^A, \dots, s_r^A)$.

Example. Assume that s and p are terms denoting two nodes of an oriented
tree. Then the transition rule

    Parent(s) := p

interpreted in a state $A$ by the update $((Parent, s^A), p^A)$ will transform $A$ into $B$
so that $Parent^B(s^A) = p^A$ and the other locations remain intact.

A conditional rule having the form

    if g then $R_1$ else $R_2$ endif,

where g is a Boolean term and $R_1$, $R_2$ are rules, causes the execution of either
$R_1$ or $R_2$ depending on whether g is true or false.

Another basic rule is a block constructed as follows:

    do in-parallel $R_1, \dots, R_n$ enddo,

where $R_1, \dots, R_n$ are rules. The block rule is interpreted by an update set
consisting of the updates produced by the interpretations of $R_1, \dots, R_n$. The state
does not change, of course, if the update set is inconsistent.
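
The three basic rules above can be read as functions from a state to an update set. The following Python sketch takes this view; it is an illustration only, in which terms are modelled as Python functions of the state and all names (`update_rule`, `conditional`, `block`, `step`) are hypothetical.

```python
def update_rule(fname, arg_terms, term):
    """f(s) := t : evaluate the argument terms and t in the current state."""
    def rule(state):
        args = tuple(s(state) for s in arg_terms)
        return {((fname, args), term(state))}
    return rule

def conditional(guard, r1, r2):
    """if g then R1 else R2 endif"""
    return lambda state: r1(state) if guard(state) else r2(state)

def block(*rules):
    """do in-parallel R1, ..., Rn enddo : union of the update sets."""
    def rule(state):
        updates = set()
        for r in rules:
            updates |= r(state)
        return updates
    return rule

def step(state, rule):
    """Interpret a rule; the state is unchanged if the update set is inconsistent."""
    targets = {}
    for location, value in rule(state):
        if targets.setdefault(location, value) != value:
            return state
    return {**state, **targets}

read = lambda name: (lambda s: s[(name, ())])   # term denoting a nullary function
start = {("x", ()): 1, ("y", ()): 2}

swap = block(update_rule("x", [], read("y")), update_rule("y", [], read("x")))
clash = block(update_rule("x", [], lambda s: 5), update_rule("x", [], lambda s: 6))
```

Interpreting `swap` exchanges the two values because both right-hand sides are evaluated in the same state, whereas `clash` leaves the state as it is.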

The last basic rule is an import rule having the following form:

    import v R(v) endimport,

where v is an identifier and R(v) is a rule using this identifier as a free variable.
The interpretation of this rule in a state A causes the extension of its basic set
(super-universe) with a new element a and the subsequent interpretation of R
with v bound to a. It is supposed that different imports produce different reserve
elements. For example, the interpretation of the block

    do in-parallel
      import v Parent(v) := c endimport
      import v Parent(v) := c endimport
    enddo

creates two children of node c.

There are several extensions of the set of basic rules. A try rule of the form

    try $R_1$ else $R_2$ endif

permits some form of exception handling, i.e., the rule $R_2$ is executed only if $R_1$
is inconsistent.

A nondeterministic choose rule of the form

    choose v : g(v) R(v) endchoose,

where v is an identifier, and g(v) and R(v) are, respectively, a Boolean term
and a rule both using v as a free variable, causes the execution of R only for
some one element of the superuniverse satisfying g. This means that, if there are
several superuniverse elements such that g evaluates to true for v bound to any
of them, then nondeterministically one of them is chosen and R is executed with
v bound to this element.

Finally, a do-forall rule of the form

    do forall v : g(v) R(v) enddo

causes the execution of R for every superuniverse element bound to v and
satisfying g. In this way the quantification of elementary modifiers and conditional
elementary modifiers of AS-IS is generalised to any transition rule.
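
The choose and do-forall rules can be sketched in Python as follows. The sketch assumes a finite superuniverse passed explicitly as a list, models guards and rule bodies as Python functions, and assumes that a choose rule with no candidate produces no update; none of these names or conventions come from [29].

```python
import random

def do_forall(universe, guard, body):
    """do forall v : g(v) R(v) enddo : collect the updates of R for every match."""
    def rule(state):
        updates = set()
        for v in universe:
            if guard(v, state):
                updates |= body(v)(state)
        return updates
    return rule

def choose(universe, guard, body, rng=random):
    """choose v : g(v) R(v) endchoose : run R for one arbitrarily chosen match."""
    def rule(state):
        candidates = [v for v in universe if guard(v, state)]
        if not candidates:
            return set()      # assumption: nothing to choose means no update
        return body(rng.choice(candidates))(state)
    return rule

nodes = ["a", "b", "c"]
tree = {("Parent", ("a",)): "c", ("Parent", ("b",)): "c", ("Parent", ("c",)): "undef"}

is_child_of_c = lambda v, s: s[("Parent", (v,))] == "c"
mark = lambda v: (lambda s: {(("Marked", (v,)), True)})

all_marks = do_forall(nodes, is_child_of_c, mark)(tree)
one_mark = choose(nodes, is_child_of_c, mark, random.Random(0))(tree)
```

`all_marks` contains one update per child of c, while `one_mark` contains exactly one of them.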

Several abbreviation conventions introduce syntactic sugar that makes it
possible to flatten nested conditional rules, to omit the “else” part when it is not
necessary, to import several elements in one import rule, to combine try and block
rules, etc.

It is important to note that, in contrast to AS-IS described in the previous
section, no effort is made to ensure that two function updates do not update
the same function at the same point; all possible inconsistencies are resolved at
the level of the update set, as described above. That is why quantification can be
applied here to any transition rule.

To conclude this short review of GASM, we reproduce (using the syntax 
described) the specification of a stack machine given in [26]. 

The stack machine computes expressions given in reverse Polish notation, or 
RPN. It is supposed that the RPN expression is given in the form of a list where 
each entry denotes a number or an operation. The stack machine reads one entry 
of the list at a time. If the entry denotes a number, it is pushed onto the stack. 
If the entry denotes an operation, the machine pops two items from the stack, 
applies the operation and pushes the result onto the stack. At the beginning, 
the stack is empty. It is supposed that the desired ASM has universes Data for
the set of numbers and Oper for the set of binary operations on Data. Arg1
and Arg2 are distinguished elements of Data. To handle operations in Oper, the
ASM has a ternary function Apply such that Apply(f, x, y) = f(x, y) for all f in
Oper and all x, y in Data.

To handle the input, the ASM has a universe List of all lists composed of
data and operations. The basic functions Head and Tail have the usual meaning.
If L is a list, then Head(L) is the first element of L and Tail(L) is the remaining
list. F is a distinguished list initially containing the input. Finally, the ASM
has a universe Stack of all stacks of data with the usual operations Push, Pop,
and Top. S is a distinguished stack, initially empty. With these explanations, the
specification of the algorithm looks as follows:

if Data(Head(F)) = true then
  do in-parallel
    S := Push(Head(F), S)
    F := Tail(F)
  enddo
endif

if Oper(Head(F)) = true then
  if Arg1 = undef then
    do in-parallel
      Arg1 := Top(S)     % Arg1 is defined now
      S := Pop(S)
    enddo
  elseif Arg2 = undef then
    do in-parallel
      Arg2 := Top(S)     % Arg2 is defined now
      S := Pop(S)
    enddo
  else
    do in-parallel
      S := Push(Apply(Head(F), Arg1, Arg2), S)
      F := Tail(F)
      Arg1 := undef      % Arg1 is undefined now
      Arg2 := undef      % Arg2 is undefined now
    enddo
  endif
endif
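
To see how this specification behaves, it can be simulated step by step. The Python sketch below is an illustration under assumptions, not the specification of [26]: the stack and the input list are tuples, operations are Python callables, `None` stands for undef, and the machine is assumed to stop when the input list is empty. Each call of `step` reads the current state only, which mimics the simultaneous firing of the updates in a block.

```python
import operator

UNDEF = None   # stands for the distinguished element undef

def step(state):
    """Perform one step of the stack machine and return the next state."""
    f, s = state["F"], state["S"]
    arg1, arg2 = state["Arg1"], state["Arg2"]
    if not f:
        return state                      # assumption: no rule fires on empty input
    head, tail = f[0], f[1:]
    if not callable(head):                # Data(Head(F)) = true
        return {**state, "S": (head,) + s, "F": tail}
    if arg1 is UNDEF:                     # Oper(Head(F)) = true from here on
        return {**state, "Arg1": s[0], "S": s[1:]}
    if arg2 is UNDEF:
        return {**state, "Arg2": s[0], "S": s[1:]}
    return {"S": (head(arg1, arg2),) + s, "F": tail, "Arg1": UNDEF, "Arg2": UNDEF}

def run(rpn):
    """Run the machine on an RPN expression and return the top of the stack."""
    state = {"F": tuple(rpn), "S": (), "Arg1": UNDEF, "Arg2": UNDEF}
    while state["F"]:
        state = step(state)
    return state["S"][0]

product_of_sum = run([2, 3, operator.add, 4, operator.mul])   # (2 + 3) * 4
difference = run([3, 4, operator.sub])
```

Note that, as in the specification, Arg1 receives the top of the stack, so `difference` is computed as 4 - 3; this differs from the usual RPN convention, in which the element below the top is the first operand.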



3.2 Typed Abstract State Machines 

The above example clearly indicates some shortcomings of Gurevich ASMs. The 
first of them is the absence of a formal definition of the static part of the state. 
Therefore, it is defined in plain words (universes Data, Stack, List, and Oper,
operations Head, Tail, etc.). This is typical of ASMs. When writing a specification,
one can write the signature of any function operating with values of one or
more universes. One cannot, however, define formally the semantics of a static 
function or a sufficiently large set of values of a particular universe. It is assumed 
that the behavior of all static functions is either well known or defined by some 
external tools; in the majority of cases, the same refers to universes (one can 
make sure of this, looking at the definition of C [27] where almost all static 
functions and universes are defined in plain words) . 

The second shortcoming is the actual absence of a type system: one cannot
construct arbitrary data types and functions with a well-defined semantics, so
one has either to use a small number of well-known data types like Boolean,
Integer, etc. or to define the needed data types and functions informally. The
results of this shortcoming are well known: neither an appropriate structuring of
the data of an application nor type checking of a specification is possible. At the
same time, a big specification, like a big program, is error-prone, and type checking
helps to detect many errors at the earliest stage of the specification development.
For example, the following error could be made in the above specification:

    S := Tail(F)

Unfortunately, no formal tool is able to detect this error, and it can only be
found with the use of a concrete input in the process of interpretation, if
an interpreter is developed.

For these reasons several attempts have been made to introduce typing in
ASMs. The first proposal is described in [39] and its modification in [40]. An
Oberon compiler is fully specified with the use of the method [41]. A
distinguished feature of the approach is the actual proposal of a specification
mechanism incorporating the advantages of both many-sorted algebraic specifications
and ASMs. The main idea behind the choice of basic specification constructs has
been to use the notions most familiar to the programming community. Another
task has been the avoidance of any logic other than first-order many-sorted
logic, which is the most familiar to computer scientists.

As a result, universes are replaced with data types for which the semantics 
can be formally defined by means of algebraic equations. The mechanism pro- 
vides means for defining both concrete data types and type constructors (generic, 
or parameterized data types). Some popular data types and type constructors 
are built-in (these are enumeration type, record type and union type construc- 
tors). Data type operations are defined together with the corresponding sort in 
a so called data type specification. In addition, independent static functions (i.e. 
functions not attributed to particular data types) can be specified with the use 
of data type operations. 

The set of transition rules proposed in the approach is mainly based on the set 
of basic rules of [28]. There is, however, an important difference in the treatment 
of the assignment of an undefined value to a location. There cannot be a single
undef value for all data types. To simplify the specification of data types, none
of them is equipped with its own undef value. Partial functions are used instead,
and a definedness predicate, D, is introduced. For each term t, the predicate
D(t) holds in a given algebra A if t is defined in it and does not hold otherwise.
In an update rule

    $f(t_1, \dots, t_n) := undef$

undef is just a keyword indicating that $f(t_1, \dots, t_n)$ becomes undefined.

For the interpretation of such a construction, another algebra update, $\beta$, is
introduced in addition to the update $\alpha$ described above. An update $\beta$ is just a
location. To update the state A using $\beta$, it is necessary to convert A into a new
algebra B so that the content of the location is undefined. The other locations
remain intact.

The other main additions are a sequence constructor and a tagcase constructor
resembling, respectively, a compound statement and a tagcase statement of some
programming languages. The need for a sequential rule constructor has arisen in
several practical applications and is noted in [4,22]. Sequential compositions are
also part of AS-IS, as described in Section 2. The tagcase rule constructor is
needed when union types are used. It has the following form:

    tagcase u of $T_1 : R_1, T_2 : R_2, \dots, T_k : R_k$ endtag

where u is a term of type $Union(T_1, T_2, \dots, T_n)$, $R_1, R_2, \dots, R_k$ are rules, and
$k \le n$. In the interpretation of the rule, the component type of u is compared
with $T_1, \dots, T_k$. If the component type is $T_i$, then $R_i$ is executed regarding u as a
term of type $T_i$. Thus, the tagcase constructor permits us to manipulate a union
type value as a value of the type needed (this facility is not provided by the
conditional constructor).
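
The dispatch performed by a tagcase rule can be mimicked in Python, although Python offers only a dynamic check where the typed ASM approach relies on static typing. In this hypothetical sketch the union type Union(Nat, Oper) is approximated by the Python types `int` and `str`, and a value whose component type is not listed (the case k < n) is assumed to trigger no rule.

```python
def tagcase(u, branches):
    """Run the rule attached to the component type of u.

    branches is a list of (component_type, rule) pairs; each rule receives u
    and may treat it as a value of that component type.
    """
    for component_type, rule in branches:
        if type(u) is component_type:
            return rule(u)
    return None   # assumption: an unlisted component type triggers no rule

def describe(entry):
    return tagcase(entry, [
        (int, lambda n: "number " + str(n)),     # plays the role of Nat
        (str, lambda op: "operator " + op),      # plays the role of Oper
    ])
```

Inside each branch the value is used as a member of one component type only, which is the facility that a plain conditional rule does not offer.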

To demonstrate the facilities of the approach, we rewrite the previous example
of a stack machine. Notation: the data type signature is enclosed in square
brackets, the axioms are enclosed in curly brackets, and the symbol “@” inside the
data type signature denotes the type being specified.

type Oper = (’+’, % enumeration type 

type Doper = Union(Nat, Oper); %union type 

type Stack(T: TYPE) = spec
  [empty: @;
   push: T, @ → @;
   pop: @ → @;
   top: @ → T];
  {axioms are conventional}

type List(T: TYPE) = spec
  [empty: @;
   append: T, @ → @;
   head: @ → T;
   length: @ → Nat;
   tail: @ → @;
   has: @, T → Boolean;
   is_empty: @ → Boolean]
  {axioms are conventional}

dynamic const S: Stack(Nat) = empty;    % initially empty stack
dynamic const Arg1, Arg2: Nat;          % initially undefined constants
dynamic const F: List(Doper);           % initialized by a demon

tagcase head(F) of
  Nat: do in-parallel S := push(head(F), S), F := tail(F) enddo,
  Oper:
    if ¬D(Arg1) then                    % if Arg1 is undefined
      do in-parallel Arg1 := top(S), S := pop(S) enddo
    elseif ¬D(Arg2) then                % if Arg2 is undefined
      do in-parallel Arg2 := top(S), S := pop(S) enddo
    else
      do in-parallel
        S := push(apply(head(F), Arg1, Arg2), S),
        F := tail(F),
        Arg1 := undef,                  % Arg1 is undefined now
        Arg2 := undef                   % Arg2 is undefined now
      enddo
    endif
endtag

Note that all the operations used in the example are now formally defined, in
contrast to the previous version of the example. Moreover, a type checker can
easily detect an error like the previous one and even one like the following
(which could not be detected if a conditional rule were used):

    tagcase head(F) of
      Oper: do in-parallel S := push(head(F), S), F := tail(F) enddo,



The other innovations of the approach are dependent functions and procedures 
(defined modifiers) resembling the corresponding constructs of AS-IS. However, 
their semantics, as it is defined in [43], is quite different. It will be explained in 
the next section. 

There is no import rule, of course. In a typed environment where each algebra 
element is denoted by (at least one) ground term, it would be strange to manip- 
ulate unreachable elements in addition. Some technique of the specification of 
the operations as dynamic functions could help to solve the problem, but these 
complications do not seem necessary. Structures like sets or lists can be used to 
achieve the goal. 

Another proposal for typed ASMs is contained in [11]. In contrast to the ap- 
proach discussed above, this approach does not confine the user to the algebraic 
style of defining data types. Only general guidelines of a simple type system 
introducing parametric polymorphism as suggested in [34] are given. The inter- 
pretation of data type is also left abstract. The only requirement is that every 
closed type is interpreted as a set. The set of rules is borrowed from [29] with 
the exception of the import rule which, of course, is not needed in a typed envi- 
ronment. There is no construct corresponding to dependent function or defined 
modifier of AS-IS. 

Object-oriented ASMs as a kind of typed ASMs are introduced in [42]. In 
addition to a number of data types, such an ASM uses a number of object types. 
While a data type defines a set of values and a set of operations, an object 
type defines a set of object behaviors. An object possesses a unique identifier
and a number of methods subdivided into attributes (corresponding to dynamic
functions), observers (corresponding to dependent functions) and mutators
(corresponding to modifiers). The tuple of attribute values defines the object’s state.

For a given object type, different system states can possess different numbers
of objects with different object states. An object’s state can be updated
with the use of a mutator. For creating new objects of type T, the import rule
of Gurevich ASMs is reinvented in the form new(T). Note that this reinvention
does not violate the term generation principle mentioned above since there is no
basic term generating an object identifier (remember that an object type defines
a set of object behaviors rather than a set of object identifiers!).

Object types are specified with the use of transition rules. Here is an example
(method profiles and method calls are written as in object-oriented
programming languages, the other notation is like that used in data type
specifications, and the two parts of an axiom are related by the symbol “==”):

class Rectangle = spec
  [mutator default_rectangle;        % setting a default rectangle's state
           create: Nat, Nat;         % setting a new rectangle's state
   attribute length, width: Nat;     % rectangle attributes defining the state
   observer area: Nat;               % computing a rectangle's area
            equal: Rectangle → Boolean  % comparison of rectangles for equality
  ]
  {forall r, r1: Rectangle, x, y: Nat.
   r.default_rectangle == do in-parallel r.length := 0, r.width := 0 enddo;
   r.create(x, y) == do in-parallel r.length := x, r.width := y enddo;
   r.area == r.length * r.width;
   r.equal(r1) == r.length = r1.length & r.width = r1.width};

Note the specification methodology: each mutator is defined in terms of a 
transition rule setting values of object’s attributes, and each observer is defined 
by a conventional axiom. 
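
For readers more familiar with object-oriented programming, the same object type can be approximated by a Python class. This is only an analogy: attributes become instance fields, mutators become methods that assign them, and observers become methods that read them. The constructor that calls `default_rectangle` is an assumption of this sketch, since the specification above does not say how a new object is initialized.

```python
class Rectangle:
    """Python analogue of the Rectangle object type (illustration only)."""

    def __init__(self):
        # assumption: a freshly created object starts in the default state
        self.default_rectangle()

    # mutators: each one sets the attribute values
    def default_rectangle(self):
        self.length, self.width = 0, 0      # tuple assignment mimics do in-parallel

    def create(self, x, y):
        self.length, self.width = x, y

    # observers: each one is computed from the attribute values
    def area(self):
        return self.length * self.width

    def equal(self, other):
        return self.length == other.length and self.width == other.width

r = Rectangle()
r.create(3, 4)
r1 = Rectangle()
r1.create(3, 4)
```

After these calls both rectangles have the state (3, 4), so `r.equal(r1)` holds although `r` and `r1` are distinct objects.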

Another version of Object-oriented ASMs permitting late binding of methods 
is described in [44]. 

4 Dynamic Systems with Implicit State 

4.1 Notion of Dynamic System 

The convergence of the works on AS-IS and typed ASM has eventually led to 
the notion of dynamic systems which is based on the state-as-algebra concept 
and formalizes state updates as operations on algebras [20,43]. 

Let $\Sigma$ be a “static” signature introducing a number of data types, $\Sigma_{eac}$ a
signature of elementary access functions, $\Sigma_{ac}$ a signature of dependent access
functions, and $\Sigma_{mod}$ a signature of modifiers. Then a dynamic system, $D(A)$, of
signature $\langle \Sigma, \Sigma_{eac}, \Sigma_{ac}, \Sigma_{mod} \rangle$, where $A$ is a $\Sigma$-algebra, is defined as a triple
with:

- a carrier $|D(A)|$, which is a set of $(\Sigma \cup \Sigma_{eac})$-algebras with the same
  $\Sigma$-algebra $A$,

- some set of dependent access functions with names and profiles defined in $\Sigma_{ac}$,

- some set of defined modifiers with names and profiles defined in $\Sigma_{mod}$.

A dependent access function name $ac : s_1, \dots, s_n \to s$ is interpreted in a
dynamic system $D(A)$ by a map $ac^{D(A)}$ associating with each $D(A)$-algebra
$A'$ (i.e., an algebra belonging to the carrier of $D(A)$) a function
$ac^{D(A)}(A') : A'_{s_1} \times \dots \times A'_{s_n} \to A'_s$.

The operation associated with a defined modifier of $\Sigma_{mod}$ is a transformation
of a $D(A)$-algebra into another $D(A)$-algebra.

4.2 Specification of a Dynamic System 

Let $DS = \langle (\Sigma, Ax), (\Sigma_{eac}, Ax_{init}), (\Sigma_{ac}, Ax_{ac}, \Sigma_{mod}, Def_{mod}) \rangle$ be a dynamic
system specification. It has three levels:

1. The first level is a classical algebraic specification $\langle \Sigma, Ax \rangle$ (cf. [12,38])
   which defines the data types used in the system. The semantics of this
   specification is given by the specification language used. The approach is relatively
   independent of a particular specification language. It is only required that
   the semantics of a specification is a class of algebras.

2. The second level defines those aspects of the system’s state which are likely
   to change and the initial states. It includes:

   - A signature, $\Sigma_{eac}$, which does not introduce new sorts. It defines the
     names and profiles of elementary access functions. A model of the
     specification $\langle \Sigma \cup \Sigma_{eac}, Ax \rangle$ is a state. In the sequel, $\Sigma'$ stands for
     $\Sigma \cup \Sigma_{eac}$.

   - A set of axioms, $Ax_{init}$, characterizing the admissible initial states, i.e.
     stating the initial properties of the system.

3. The third level defines some dependent access functions and possible
   evolutions of the system’s states. Two parts are distinguished here.

   - A specification of dependent access functions $\langle \Sigma_{ac}, Ax_{ac} \rangle$. It does not
     introduce new sorts and uses the elementary access functions and the
     operations of $\Sigma$. The form of this specification is the same as in AS-IS.
     However, the semantics is different (see the preceding subsection) in
     order to simplify the semantics of state updates.

     A $D(A)$-algebra $A'$ can be extended into an algebra $A''$ of signature
     $\Sigma'' = \Sigma' \cup \Sigma_{ac}$ satisfying $Ax_{ac}$. Such an algebra is called an extended state. The
     extended state corresponding to the state $A'$ is denoted by $Ext_{\Sigma''}(A')$ in
     the sequel. Given a $\Sigma'$-algebra $A'$ and its extended state $A''$, any ground
     term of $T_{\Sigma''}$ corresponds to a value in $A'$ since the specification of $A''$
     does not introduce new sorts and is sufficiently complete with respect to
     the specification of $A'$ (cf. Section 2). Thus, the notion of the value of a
     ground $\Sigma''$-term in a $D(A)$-algebra $A'$ can be used.

   - A definition of defined modifiers, $\langle \Sigma_{mod}, Def_{mod} \rangle$. The form of this
     specification is the same as in Section 2.

     As sketched above, a modifier name $mod : s_1, \dots, s_n$ from $\Sigma_{mod}$ is
     interpreted in a dynamic system $D(A)$ by a map $mod^{D(A)}$ associating a
     $D(A)$-algebra $B$ with each pair $\langle A', \langle v_1, \dots, v_n \rangle \rangle$, where $A'$ is a
     $D(A)$-algebra and $v_i$ is an element of $A'_{s_i}$; this map must satisfy the
     corresponding definition from $Def_{mod}$ as stated in [20].

     This approach gives a semantics of modifications themselves,
     independently of their applications. Moreover, the fact that the dependent
     accesses are no longer part of the state makes the semantics of elementary
     updates much simpler [20].



4.3 States and Behaviors of the System 

The notions of state and behavior introduced in Section 2 are redefined below 
for dynamic systems. 

Let DS — A (A, Ax'j , (Ag^c? Axjnn), (Aqo Axac^ ^modi Befjriodf) A be a spec- 
ification of a dynamic system, and let A’' = A’ U Agac . 

System's state. As already mentioned, a state of the system defined by the
specification $DS$ is a $\Sigma'$-algebra satisfying the axioms $Ax$.

It is important that each change of state preserves the data types used. This
leads to the partitioning of $\langle \Sigma', Ax \rangle$-algebras into subsets,
$state_A(\Sigma', Ax)$, consisting of algebras sharing the same interpretation
of the data types. Since $\langle \Sigma', Ax \rangle$ is just an extension of
the specification $\langle \Sigma, Ax \rangle$ with some operation names, we have:

$$\bigcup_{A \in Alg(\Sigma, Ax)} state_A(\Sigma', Ax) = Alg(\Sigma', Ax)$$

Initial states. A subset of this set of models represents possible initial states of
the system being specified. It corresponds to an enrichment of the specification
$\langle \Sigma', Ax \rangle$ with $Ax_{init}$, thus:

$$state_{init}(DS) = \{ A' \in Alg(\Sigma', Ax) \mid A' \models Ax_{init} \}$$

Behavior of the system. A behavior is a sequence of updates which are
produced by the invocations of some modifiers. Several sequences of states
$(e_0, e_1, e_2, \ldots)$ correspond to a behavior $(m_0, m_1, m_2, \ldots)$
depending on the choice of the initial state:

— the initial state $e_0$ belongs to $state_{init}(DS)$;

— each $e_{i+1}$ is the result of the application of the modifier $m_i$ to
$e_i$ ($e_{i+1} = [\![ m_i ]\!] e_i$).



The semantics of updates as defined in [20] guarantees that if $e_0$ belongs to
a dynamic system $D(A)$, then any $e_i$ also belongs to $D(A)$ (the state
changes, but the data types do not change).

Like AS-IS, this formalism is deterministic for two reasons: the semantics of
elementary modifiers and, therefore, of all modifiers ensures that one^2 and only
one state (up to isomorphism) is associated with the application of a modifier to
a state; besides, the specification of dependent access functions,
$\langle \Sigma_{ac}, Ax_{ac} \rangle$, is sufficiently complete with respect to
$\langle \Sigma \cup \Sigma_{eac}, Ax \rangle$. Thus, only one sequence of states
starting with a given initial state is associated with a behavior.
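The determinism argument can be illustrated with a minimal Python sketch. It assumes that a state is modelled as an immutable tuple of (name, value) pairs and a modifier as a pure function from states to states; the names `make_state`, `increment`, `reset` and `run` are hypothetical and only serve the illustration, they are not part of the formalism.

```python
from typing import Callable, Iterable, List, Tuple

# Illustrative stand-in for a state: an immutable, sorted tuple of
# (access function name, value) pairs. Not the algebraic construction itself.
State = Tuple[Tuple[str, int], ...]
Modifier = Callable[[State], State]


def make_state(**values: int) -> State:
    """Build a state from keyword arguments."""
    return tuple(sorted(values.items()))


def increment(name: str) -> Modifier:
    """Hypothetical modifier: add 1 to the value stored under `name`."""
    def apply(state: State) -> State:
        updated = dict(state)
        updated[name] += 1
        return tuple(sorted(updated.items()))
    return apply


def reset(name: str) -> Modifier:
    """Hypothetical modifier: set the value stored under `name` to 0."""
    def apply(state: State) -> State:
        updated = dict(state)
        updated[name] = 0
        return tuple(sorted(updated.items()))
    return apply


def run(initial: State, behavior: Iterable[Modifier]) -> List[State]:
    """Return (e0, e1, ...) where each e_{i+1} is m_i applied to e_i."""
    states = [initial]
    for modifier in behavior:
        states.append(modifier(states[-1]))
    return states


trace = run(make_state(counter=0),
            [increment("counter"), increment("counter"), reset("counter")])
```

Because every modifier is a function, a behavior and an initial state determine exactly one sequence of states, which mirrors the claim made in the text.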

Reachable states. The set of reachable states, $REACH(DS)$, is the set of
states which can be obtained by a sequence of updates corresponding to the
invocations of some modifiers of $\Sigma_{mod}$, starting from an initial state.

Thus, the set $REACH(DS)$ is recursively defined in the following way:

- $state_{init}(DS) \subseteq REACH(DS)$

- $\forall m \in \Sigma_{mod}$, $\forall t_1, \ldots, t_n$, $\forall A' \in REACH(DS)$:
  $[\![ m(t_1, \ldots, t_n) ]\!] A' \in REACH(DS)$.
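For a finite state space, the recursive definition of the reachable states can be computed as a least fixed point. The following Python sketch assumes that states are hashable values and that each element of `modifiers` stands for one fully instantiated modifier invocation; the function `reachable`, the exploration `limit` and the toy modifiers are assumptions made for the example.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set


def reachable(initial_states: Iterable[Hashable],
              modifiers: Iterable[Callable[[Hashable], Hashable]],
              limit: int = 10_000) -> Set[Hashable]:
    """Least set containing the initial states and closed under all modifiers."""
    modifiers = list(modifiers)
    seen = set(initial_states)
    queue = deque(seen)
    while queue:
        state = queue.popleft()
        for modifier in modifiers:
            successor = modifier(state)
            if successor not in seen:
                if len(seen) >= limit:
                    # The real definition has no bound; this guard only keeps
                    # the sketch from looping on infinite state spaces.
                    raise RuntimeError("state space exceeds the exploration limit")
                seen.add(successor)
                queue.append(successor)
    return seen


def add_two(state: int) -> int:
    """Toy modifier on the states 0..5."""
    return (state + 2) % 6


def reset(state: int) -> int:
    """Toy modifier that always yields state 0."""
    return 0


even_only = reachable({0}, [add_two])
everything = reachable({1}, [add_two, reset])
```

The two calls show that the reachable set depends on both the initial states and the available modifiers: with `add_two` alone only the even states are reached from 0, whereas adding `reset` makes all six states reachable from 1.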



^2 provided that the validity/invalidity of the conditions in conditional updates
is always defined
5 Related Works

One of the first specification languages with states represented by algebras is 
COLD-K [18], the kernel language of the COLD family of specification languages. 
It possesses many of the features mentioned above, e.g. dynamic (elementary 
access) functions, dependent (access) functions and procedures (modifiers).
Procedures are considered as relations on states. For specification purposes,
some imperative constructions (sequential composition expressions and repetition
expressions) are used. However, it is still mainly an axiomatic specification
language using pre- and post-conditions resembling those of VDM.

The idea of implicit state in terms of a new mathematical structure, d-oid, 
is given by Astesiano and Zucca [2]. A d-oid, like the dynamic system described
above, is a set of algebras (states) called instant structures, a set of dynamic
operations (transformations of instant structures with a possible result of a
definite sort) and a tracking map indicating relationships between instant structures.
Dynamic operations in a d-oid serve as counterparts of dependent access func- 
tions and modifiers in AS-IS and the tracking map provides a very abstract way 
of identifying components of different instant structures (there is no notion of 
tracking map in the above definition of dynamic system since each algebra of the 
same signature is by definition a mapping of the same set of names to a seman- 
tics universe). The approach in question deals only with models and does not 
address the issue of specifying the class of such behaviors, which is the purpose 
of imperative specifications. 

Dynamic types as a modified version of d-oid are further investigated in [45]. 
Although no direct definition of a dynamic abstract type is given in that paper,
it contributes formal definitions of a static framework and of a dynamic
framework with a corresponding logical formalism over a given static framework.
It seems that the formalism can be used as a basis of an imperative specification 
language. 

Another similar approach is the “Concurrent State Transformation on Abstract
Data Types” presented in [23]. It also uses the idea of implicit state, which
is modeled as a partial algebra that extends a fixed partial algebra considered as
a static data type. All functions are given at the same level. Dynamic func- 
tions are considered totally undefined in the static data type. A state on a given 
partial algebra is a free extension of this algebra, specified by a set of function 
entries. Invariant relations between dynamic operations are given by axioms at 
the static level. Transitions between states are specified by conditional replace- 
ment rules indicating the function entries that should be added/removed when 
the condition is valid. 

There are some restrictions on the partial equational specifications for the 
static data types, the admissible partial algebras and states, and the replacement 
rules in order to have the same structural properties as the algebraic
specification logic. The most severe of them is the restriction of replacement
rules to redefinitions of so-called contents functions, which correspond to the
mappings of variables to their values in programming languages. This severely
restricts the use of the formalism (one cannot define and update an arbitrary
dynamic function).

In a slightly revised form the formalism is used in [24] for the definition of 
algebra transformation systems and their compositions. 

Algebra updating operations are interpreted as relations between algebras in
[3], and these relations are specified by the usual algebraic specification
technique. To distinguish between the original and updated values of the
same function (constant), one has to decorate its name in a formula. This leads
to the necessity of having two signatures (one for the original algebra and one
for the resulting algebra) and signature morphisms for establishing the
correspondence between decorated and non-decorated versions of the same name and
writing formulae in the discriminated union of the signatures. Judging from some
examples in the paper, this can lead to rather complex specifications.

Finally, the specification language Troll [30] should be mentioned. It is
oriented toward the specification of static and dynamic properties of objects,
where a method (event) is specified by means of evaluation rules resembling
equations on attribute values. Although the semantics of Troll is given rather
informally, there is a strong mathematical foundation of its dialect, Troll
light [21], with the use of data algebras, attribute algebras and event algebras.
An attribute algebra represents a state. A relation constructed on two sets of
attribute algebras and a set of event algebras, called object community,
formalizes a transition from one attribute algebra to another when a particular
event algebra takes place.



6 Conclusion 

This survey paper provides a guided tour of several imperative approaches to
specification based on the state-as-algebra paradigm. Section 4 sketches a
synthesis of two of them, under the name of dynamic systems with implicit state.

Some of these approaches differ in significant ways. This is an indication of the
generality of the paradigm. In AS-IS, the aim is to specify the dynamic evolutions
of the specified systems in a high-level and non-algorithmic way. In ASM, the
goal is to provide a way of describing algorithms in an abstract way. Moreover,
the problem of multiple inconsistent updates is treated very differently in
the two approaches, as mentioned in Section 3.1.

One of the advantages of these approaches to formal specification is a better
understandability for people familiar with imperative programming. AIS use a
simple syntax which can be read as a form of high-level code.

Another advantage is their generality. AIS have been shown to be useful in
such a wide variety of domains as sequential, parallel and distributed systems
with either finite-state or infinite domains.

A current weakness of these approaches is the lack of a formal calculus to
perform proofs. It is very likely that a calculus based on the concept of
substitution, in the line of Abrial’s calculus for B [1], could be developed. It
is the subject of some future work.
Abstract. A semi-visual framework for the specification of syntax and
semantics of imperative programming languages, called Montages, was 
proposed in an earlier work by the authors. The primary aim of this 
formalism is to assist in recording the decisions taken by the designer 
during the language design process. The associated tool Gem-Mex allows 
the designer to maintain the specification and to inspect the semantics 
to verify whether the design decisions have been properly formalized. 
Experience with full-scale case studies on Oberon, Java, and domain 
specific languages showed the close relationship to Finite State Machines 
(FSMs). This paper gives a new definition of Montages based on FSMs. It 
confers on the formalism enhanced pragmatic qualities, such as writability,
extensibility, readability, and, in general, ease of maintenance.



1 Introduction 

The aim of Montages is to document formally the decisions taken during the 
design process of realistic programming languages. Syntax, static and dynamic 
semantics are given in a uniform and coherent way by means of semi-visual 
descriptions. The static aspects of a language are diagrammatic descriptions of 
control flow graphs, and the overall specifications are similar in structure, length, 
and complexity to those found in common language manuals. 

The departure point for our work has been the formal specification of the 
C language [10]^, which showed how the state-based formalism Abstract State 
Machines [8, 9, 13] (ASMs), formerly called Evolving Algebras, is well-suited for 
the formal description of the dynamic behavior of full-blown practical languages. 
In essence, ASMs constitute a formalism in which a state is updated in discrete 
time steps. Unlike most state-based systems, the state is given by an algebra that 
is a collection of functions and universes. The state transitions are given by rules 
that update functions pointwise and extend universes with new elements. The 
model presented in [10] describes the dynamic semantics of the C language by
presupposing an explicit representation of control and data flow as a graph. This

^ Historically the C case study was preceded by work on Pascal [8] and other
languages; see [5] for an annotated bibliography of ASM case studies.
represents a major limitation for such a model, since the control and data flow
graph is a crucial part of the specification. Therefore, we developed Montages 
which extend the approach in [10] by introducing a mapping which describes 
how to obtain the control and data flow graph starting from the abstract syntax 
tree. 

The formulation of Montages [17] was strongly influenced by some case stud- 
ies [16, 18] where the object-oriented language Oberon [26] has been specified. 
Montages have been used also in other case studies, such as the specification 
of the Java [25] language, the front-end for correct compiler construction [11], 
and the design and prototyping of domain-specific languages in an industrial
context [19]. The experience showed that the underlying model for the dynamic 
semantics, namely the specification of a control flow graph including conditional 
control flow and data flow arrows and its close relationship to the well known 
concept of Finite State Machines, shortens the learning curve considerably. In 
this paper a new FSM based definition of Montages is given. Complete references, 
documentation and tools can be obtained via [4].

2 Montages 

In our formalism, the specification of a language consists of several components. 
As depicted in Fig. 1, the language specification is partitioned into three parts. 



[Figure 1 diagram; panels labeled “language specification” and “language instances”]




Fig. 1. Relationship between language specification and instances 
1. The EBNF production rules are used for the context-free syntax of the
specified language L, and they allow a parser for programs of L to be generated.
Furthermore, the rules define in a canonical way the signature of abstract 
syntax trees (ASTs) and how the parsed programs are mapped into an AST. 
Section 2.1 contains the details of this mapping. In Fig. 1 the dotted arrow 
from the EBNF rules indicates that this information is provided from the 
Montage language specification. 

2. The next part of the specification is given using the Montage Visual Language
(MVL). MVL has been explicitly devised to extend EBNF rules to finite state
machines (FSM). An MVL description associated to an EBNF rule basically
defines a local finite state machine and contains information on how this FSM
is plugged into the global FSM via an inductive decoration of the abstract
syntax trees. To this end, each node is decorated with a copy of the finite
state machine fragment given by its Montage. The reference to descendants
in the AST defines an inductive construction of a global structured FSM. In
Section 2.2 we define how this construction works exactly.

3. Finally, any state of the FSM may be associated with an Abstract State 
Machine (ASM) rule. This action rule is fired when the corresponding state
is reached. As shown in Fig. 1, the specification of these rules is the third 
part of a language specification. 

The complete language specification is structured in specification modules, 
called Montages. Each Montage is a “BNF-extension-to-semantics” in the sense
that it specifies the context-free grammar rule (by means of EBNF), the (local)
finite state machine (by means of MVL), and the dynamic semantics of the
construct (by means of ASMs). The special form of EBNF rules allowed in a
specification and the definition of Montages ensure that each node in
the abstract syntax tree belongs to exactly one Montage.

As an example the Montage for a nonterminal with name Sum is shown in 
Fig. 2. The topmost part of this Montage is the production rule defining the
context-free syntax. The remaining part defines static aspects of the construct 
given by means of an MVL description. Additionally, the Montage contains an 
action rule, which is evaluated when the FSM reaches the add state. 



Sum ::= Factor "+" Expr                          [EBNF]

S-Factor    S-Expr                               [MVL description
                                                  (local finite state machine)]

@add:                                            [ASM transition rule]
    value := S-Factor.value + S-Expr.value

Fig. 2. Montage components 
The definition of Montages usually contains a fourth section which is devoted
to the specification of static analysis and semantics. After working with fixed 
traversal orders and non-local attributions, we found that Reference Attribute 
Grammars [12] are most suited for our purpose, since they do not restrict the 
use of non-local references. The result of the attribution can be used to define 
firing conditions and actions in the FSM. 

The use of attribute grammars for static analysis and semantics is a standard 
technique. In [12] it is shown how reference attribute grammars define static
properties of object-oriented languages in a simple and concise way. Furthermore,
[22] uses a corresponding functional system in combination with ASMs and shows
how to describe static and dynamic aspects of full-blown languages. In contrast to
these works, Montages has an elaborate visual formalism for the specification
of sequential control flow by means of FSMs. These aspects are presented in the
next sections.



2.1 From Syntax to AST

In this section, the first step in Fig. 1 is described. As a result of this step 
we get the abstract syntax tree of the specified program. But we also compose 
the Montages corresponding to the different constructs of the language. This 
composition of the partial specifications is done based on the structure of the 
AST. 

EBNF rules. The syntax of the specified language is given by the collection of 
all EBNF rules. Without loss of generality, we assume that the rules are given 
in one of the two following forms: 



A ::= B C D        (1)

E = F | G | H      (2)

The first form defines that A has the components B, C, and D whereas the
second form defines that E is one of the alternatives F, G, or H. Rules of the first
form are called characteristic productions and rules of the second form are called 
synonym productions. We guarantee that each non-terminal symbol appears in 
exactly one rule as the left-hand-side. Non-terminal symbols appearing on the 
left of the first form of rules are called characteristic symbols and those appearing 
on the left of synonym productions are called synonym symbols. 
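The two rule forms and the uniqueness requirement on left-hand sides can be made concrete with a small Python sketch. It assumes that characteristic productions are written with `::=` and synonym productions with `=`, as in forms (1) and (2); the helper names `parse_rule` and `index_rules` are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_rule(text: str) -> Tuple[str, str, List[str]]:
    """Split one rule into (left-hand side, kind, right-hand side symbols)."""
    if "::=" in text:
        # Characteristic production: the right-hand side lists the components.
        lhs, rhs = text.split("::=")
        return lhs.strip(), "characteristic", rhs.split()
    # Synonym production: the right-hand side lists the alternatives.
    lhs, rhs = text.split("=")
    return lhs.strip(), "synonym", [alt.strip() for alt in rhs.split("|")]


def index_rules(lines: List[str]) -> Dict[str, Tuple[str, List[str]]]:
    """Index rules by left-hand side, rejecting symbols defined twice."""
    table: Dict[str, Tuple[str, List[str]]] = {}
    for line in lines:
        lhs, kind, rhs = parse_rule(line)
        if lhs in table:
            raise ValueError(f"non-terminal {lhs} appears in more than one rule")
        table[lhs] = (kind, rhs)
    return table


rules = index_rules(["A ::= B C D", "E = F | G | H"])
```

In this sketch, `A` is classified as a characteristic symbol and `E` as a synonym symbol, and a second rule for either of them is rejected.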

Composition of Montages. Each characteristic symbol and certain terminal 
symbols define a Montage. A Montage is considered to be a class^ whose
instances are associated to the corresponding nodes in the abstract syntax tree.

^ In this context we consider a class to be a special kind of abstract data type,
having attributes and methods (actions) and, most important for us, where the
notions of sub-typing and inheritance are predefined in the usual way.
Symbols in the right-hand side of a characteristic EBNF rule are called (direct)
components of the Montage, and symbols which are reachable as components of 
components are called indirect components. In order to access descendants of a 
given node in the abstract syntax tree, statically defined attributes are provided. 
Such attributes are called selectors and they are unambiguously defined by the 
EBNF rule. In rule (1) above, the B, C, and D components of an A instance
can be retrieved by the selectors S-B, S-C, and S-D. Fig. 3 depicts a possible
representation of the A-Montage as a class and an abstract syntax tree (AST)
with two instances of A and their components.




Fig. 3. Montage class A, instances in the AST, selectors S-B, S-C, S-D 

Synonym rules introduce synonym classes and define subtype relations. The 
symbols on the right-hand-side of a synonym rule can be further synonym classes 
or Montage classes. Each class on the right-hand-side is a subtype of the intro- 
duced synonym class. Thus, each instance of one of the classes on the right-hand 
side is an instance of the synonym class on the left-hand-side, e.g. in the given 
example, all F-, G-, and H-instances are E-instances as well. In the AST, each 
inner node is an instance of arbitrarily many (possibly zero) synonym classes
and of exactly one Montage. 

Terminals, e.g. identifiers or numbers, do not correspond to Montages. The 
micro-syntax can be accessed using an attribute Name from the corresponding 
leaf node. The described treatment of characteristic and synonym productions 
allows for an automatic generation of AST from the concrete syntax given by 
EBNF, see also the work in [21]. 

Induced structures. Inside a Montage class, the term self denotes the current 
instance of the class. Using the selectors, and knowledge about the AST, we can 
build paths relative to self. For instance, the path self.S-B.S-H.S-J denotes a node
of class J, which can be reached by following the selectors S-B, S-H, and then 
S-J, see Fig. 4. The use of such a path in a Montage definition imposes a number 
of constraints on the other EBNF rules of the language. The example
self.S-B.S-H.S-J requires that there is a B component in the Montage containing
the path. Further, every subtype of B must have an H component, and every
subtype of H must have a J component. In other words, the path self.S-B.S-H.S-J
must exist in all possible ASTs.
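The constraint imposed by a path can be checked mechanically. The following Python sketch assumes that characteristic rules are given as a mapping from a Montage to its component symbols, that synonym rules are given as a mapping from a synonym symbol to its alternatives, that synonym rules are not cyclic, and that a selector S-X is represented simply by the symbol X. The function names and the sample rules for `B1` and `B2` are hypothetical.

```python
from typing import Dict, List, Set


def montages_of(symbol: str,
                components: Dict[str, List[str]],
                alternatives: Dict[str, List[str]]) -> Set[str]:
    """All Montage classes that are (transitive) subtypes of `symbol`."""
    if symbol in components:
        return {symbol}
    result: Set[str] = set()
    for alternative in alternatives.get(symbol, []):
        result |= montages_of(alternative, components, alternatives)
    return result


def path_always_exists(montage: str, path: List[str],
                       components: Dict[str, List[str]],
                       alternatives: Dict[str, List[str]]) -> bool:
    """True if every possible AST node of `montage` admits the selector path."""
    current = {montage}
    for selector in path:
        if not current:
            # The previous step ended in a terminal, so the path cannot continue.
            return False
        following: Set[str] = set()
        for m in current:
            if selector not in components[m]:
                return False
            following |= montages_of(selector, components, alternatives)
        current = following
    return True


components = {"A": ["B", "C", "D"], "B1": ["H"], "B2": ["H", "K"],
              "H": ["J"], "J": []}
alternatives = {"B": ["B1", "B2"]}

ok = path_always_exists("A", ["B", "H", "J"], components, alternatives)
broken = path_always_exists("A", ["B", "H", "J"],
                            dict(components, B2=["K"]), alternatives)
```

In the second call one subtype of B lacks an H component, so the path is not guaranteed to exist in all ASTs and the check fails.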




Fig. 4. Montage A using path self.S-B.S-H.S-J, situation in AST, and constraints 
on EBNF rules of B, H 



Example. As a running example we give a small language S. The expressions 
in this language potentially have side effects and must be evaluated from left 
to right. The atomic factors are integer constants and variables of type integer. 
The start symbol of the EBNF is Expr, and the remaining rules are 



Expr     =   Sum | Factor
Sum      ::= Factor "+" Expr
Factor   =   Variable | Constant
Variable ::= Ident
Constant ::= Digits

The following term is an S-program:

    2 + x + 1



As a result of the generation of the AST we obtain the structure represented 
in Fig. 5. In particular, the nodes from 1 to 8 represent instances of the Montage 
classes and the edges point to the successors of a particular node. The edges 
are labeled with the selector functions which can be used in the Montage corre- 
sponding to the source node to access the Montage corresponding to the target 
node. The nodes themselves show the class hierarchy starting from the synonym 
class and ending with the Montage class. The leaf nodes contain the definition 
of the attribute Name, i.e. the micro-syntax. 
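The structure described above can be sketched as a small Python data structure. The `Node` class, the helper constructors and the `follow` function are assumptions made for this illustration; only the selectors, the class names and the attribute Name come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class Node:
    classes: Tuple[str, ...]                  # synonym classes first, Montage class last
    children: Dict[str, "Node"] = field(default_factory=dict)  # selector -> child
    name: Optional[str] = None                # micro-syntax, set on leaf tokens only


def constant(digits: str) -> Node:
    return Node(("Expr", "Factor", "Constant"),
                {"S-Digits": Node(("Digits",), name=digits)})


def variable(ident: str) -> Node:
    return Node(("Expr", "Factor", "Variable"),
                {"S-Ident": Node(("Ident",), name=ident)})


def sum_(factor: Node, expr: Node) -> Node:
    return Node(("Expr", "Sum"), {"S-Factor": factor, "S-Expr": expr})


def follow(node: Node, path: Sequence[str]) -> Node:
    """Follow a sequence of selectors starting at `node`."""
    for selector in path:
        node = node.children[selector]
    return node


def size(node: Node) -> int:
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children.values())


# 2 + x + 1 parses as Factor "+" Expr, with the Expr being the Sum x + 1.
ast = sum_(constant("2"), sum_(variable("x"), constant("1")))
x_token = follow(ast, ["S-Expr", "S-Factor", "S-Ident"])
```

The tree built this way has eight nodes, and the path S-Expr, S-Factor, S-Ident from the root leads to the leaf whose Name is x.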
Fig. 5. The abstract syntax tree and composition of Montages for 2 + x + 1



2.2 From AST to Control Flow Graphs

According to Fig. 1, the next step in building the data structure for the dynamic 
execution is the inductive decoration of the AST with a number of finite state 
machines. Again, this process is described rather informally here. 

As we have seen in Fig. 2, the second part of a Montage contains the neces- 
sary specifications given in form of the Montage Visual Language (MVL). The 
Montages for the productions Variable and Constant are given in Fig. 6. Two 
kinds of information are represented in the second part of a Montage: (a) the 
local state machine to be associated to the node of the AST and (b) informa- 
tion on the embedding of this local state machine. Using our running example, 
Fig. 7 just represents the MVL sections of the Montages as they are associated 
to the corresponding nodes of the abstract syntax tree. The hierarchical state 
transition graph resulting from the inductive decoration is shown in Fig. 8 for 
the running example. 



Variable ::= Ident

@lookup:
    value := CurrentStore(S-Ident.Name)


Constant ::= Digits

@setValue:
    value := S-Digits.Name



Fig. 6. The Montages for the language S 
Fig. 7. The finite state machines belonging to the nodes



Montage Visual Language. Now, the elements of the MVL and their seman- 
tics can be described as follows: 

— There are two kinds of nodes. The oval nodes represent states in the
generated finite state machine. These states are associated to the AST node
corresponding to the Montage. The oval nodes are labeled with an attribute. It
serves to identify the state, for example if it is the target of a state transition
or if it points to a dynamic action rule.

— The rectangular nodes or boxes represent symbols in the right-hand side of
the EBNF rule and are called direct components of a Montage, see Section
2.1. They are labeled with the corresponding selector function. Boxes may
contain other boxes which represent indirect components. This way, paths
in the AST are represented graphically.

— The dotted arrows are called control arrows. They correspond to edges in
the hierarchical state transition graph of the generated finite state machine.
Their source or target can be any box or oval. In addition, their source or
target can be either the symbol I (I stands for initial) or T (T stands for
terminal), respectively. In a Montage, at most one symbol of each, I and T,
is allowed. If the I symbol is omitted, the states of the Montage can only be
reached using a jump; if the T symbol is omitted, the Montage can only be
left using a jump.

— As in other state machine formalisms (such as Harel’s StateCharts), pred- 
icates can be associated to control arrows. They are simply terms in the 
underlying ASM formalism and are evaluated after executing the action rule 
associated to the source node. Predicates must not be associated to control 
arrows with source I. 

— There are additional notations not used in this paper — for example data 
flow edges representing the mutual access of data between Montages and box 
structures representing lists in an effective way. Moreover, in this section of a 
Montage, one may specify further action rules to be performed in the static 
analysis phase, for example building up data structures necessary for the 
static and dynamic semantics. 
Fig. 8. The constructed hierarchical finite state machine



It remains to show how the hierarchical finite state machine, such as the one in
Fig. 8, is built and how its dynamic semantics is defined.

Hierarchical FSM. Building the hierarchical FSM is particularly simple. The 
boxes in the MVL are references to the corresponding local state transition 
graphs. Remember that nested boxes correspond to paths in the AST. Therefore, 
there are references to children only, i.e. to other state transition graphs along 
the edges of the AST. After resolving the references, a representation as in Fig. 8 
is obtained. 

The obtained hierarchical FSM gives the dynamic semantics of the parsed 
program. Direct execution of the hierarchical FSM is possible. As in
StateCharts, a hierarchical state is entered at its initial state. This state is
marked with an I-arrow. If a final state (marked with a T-arrow) is reached, the
hierarchy is followed upwards.

Flat FSM. Alternatively, the FSM can be flattened. For this purpose the arrows
from the I and to the T symbols define two unary functions, Initial and Terminal,
denoting for each node in the AST the first and, respectively, the last state that
is visited.
According to the semantics of hierarchical FSMs, the inductive definition of these 
functions is given over the FSM states and the boxes representing instances of 
Montages. 

For each state s in the finite state machines,

    s.Initial = s          (3)

    s.Terminal = s         (4)

and for each instance n of a Montage N whose MVL-graph has an edge from I
to a component denoted by path tgt,

    n.Initial = n.tgt.Initial

and for each instance m of a Montage M whose MVL-graph has an edge from a
component denoted by path src to T,

    m.Terminal = m.src.Terminal

Using these definitions, the structured FSM is flattened by replacing each
edge e from s to t with an edge from s.Terminal to t.Initial. After this
replacement the boxes are not related to the FSM anymore. Nevertheless, the
information of the AST nodes related to source, target, and definition of an
arrow is needed for the evaluation of advanced features usable in the firing
conditions.
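The inductive definition of Initial and Terminal and the flattening step can be sketched in Python as follows. The classes `State` and `Box`, the numeric tags in the state labels, and the control arrows assumed for the Sum Montage (I to S-Factor, S-Factor to S-Expr, S-Expr to add, add to T) are assumptions made for this illustration; paths are restricted to a single selector for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class State:
    label: str                                  # e.g. "add1"; tags are for readability


@dataclass
class Box:
    """One Montage instance: local states, child boxes, and control arrows."""
    items: Dict[str, Union["Box", State]]       # selector or state name -> item
    initial: str                                # item targeted by the arrow from I
    terminal: str                               # item that is the source of the arrow to T
    edges: List[Tuple[str, str]] = field(default_factory=list)


def initial_state(item: Union[Box, State]) -> State:
    """s.Initial = s for states, n.Initial = n.tgt.Initial for boxes."""
    return item if isinstance(item, State) else initial_state(item.items[item.initial])


def terminal_state(item: Union[Box, State]) -> State:
    """s.Terminal = s for states, m.Terminal = m.src.Terminal for boxes."""
    return item if isinstance(item, State) else terminal_state(item.items[item.terminal])


def flatten(box: Box) -> List[Tuple[str, str]]:
    """Replace each edge from s to t by an edge from s.Terminal to t.Initial."""
    flat = [(terminal_state(box.items[s]).label, initial_state(box.items[t]).label)
            for s, t in box.edges]
    for item in box.items.values():
        if isinstance(item, Box):
            flat.extend(flatten(item))
    return flat


def constant(tag: int) -> Box:
    return Box({"setValue": State(f"setValue{tag}")}, "setValue", "setValue")


def variable(tag: int) -> Box:
    return Box({"lookup": State(f"lookup{tag}")}, "lookup", "lookup")


def sum_(tag: int, factor: Box, expr: Box) -> Box:
    return Box({"S-Factor": factor, "S-Expr": expr, "add": State(f"add{tag}")},
               initial="S-Factor", terminal="add",
               edges=[("S-Factor", "S-Expr"), ("S-Expr", "add")])


# Hierarchical FSM for 2 + x + 1, tagged with the node numbers used in the text.
root = sum_(1, constant(2), sum_(3, variable(5), constant(6)))
flat_edges = flatten(root)
```

Under these assumptions the flat machine starts in the setValue state of the constant 2 and ends in the add state of the outermost Sum, with four edges connecting the five states in evaluation order.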




Fig. 9. The flat finite state machine and its relation to the AST 



Applying the replacement to the running example results in the flat state 
machine of Fig. 9. In the same figure the dotted lines denote the relation of 
a state to its corresponding Montage instance, which is accessible as self. The 
relation of edges to the instances corresponding to source, target and definition 
is not given, since our example has no interesting conditions. 

Given the flat FSM as in Fig. 9, we can track the behavior given by the 
actions in Figs. 2 and 6. The initial state is the leftmost setValue state. Its 
rule updates the additional attribute value with the constant stored in the field 
Name of the digits-token. After the action rule is executed, the firing conditions 
of outgoing arrows are evaluated. In our example, only one arrow exists, with the
default condition true. Visiting the states sequentially, the action rules
are executed one after another. The most interesting is the add action. It accesses
the values of its arguments using the selectors S-Factor and S-Expr and defines its
own value field to be the sum of the arguments.

Assuming that CurrentStore maps x to 4, the execution of the flat or struc- 
tured finite state machine sets the value of node two to the constant 2, sets the 
value of node five to the current store at x, sets the value of node six to 1, sets 
the value of node three to the sum of 4 and 1, and finally sets the value of node 
one to the sum of 2 and 5. 
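This run can be replayed with a short Python sketch. The node numbers and the store value for x are taken from the text; the tables `TOKEN_NAME`, `SELECTORS` and `FLAT_FSM` and the function `execute` are hypothetical, and the token Name is stored directly on the Constant and Variable nodes to keep the sketch short.

```python
from typing import Dict, List, Tuple

CURRENT_STORE: Dict[str, int] = {"x": 4}        # CurrentStore maps x to 4

# Name of the token below each Constant/Variable node (simplification: in the
# specification the Name attribute lives on the Digits/Ident leaf).
TOKEN_NAME: Dict[int, str] = {2: "2", 5: "x", 6: "1"}

# Selector functions of the two Sum nodes.
SELECTORS: Dict[int, Dict[str, int]] = {
    1: {"S-Factor": 2, "S-Expr": 3},
    3: {"S-Factor": 5, "S-Expr": 6},
}

# States of the flat FSM in visiting order, each paired with its AST node.
FLAT_FSM: List[Tuple[str, int]] = [
    ("setValue", 2), ("lookup", 5), ("setValue", 6), ("add", 3), ("add", 1),
]


def execute(flat: List[Tuple[str, int]], store: Dict[str, int]) -> Dict[int, int]:
    """Fire the action rule of each state in turn and collect the value attributes."""
    value: Dict[int, int] = {}
    for state, node in flat:
        if state == "setValue":
            value[node] = int(TOKEN_NAME[node])
        elif state == "lookup":
            value[node] = store[TOKEN_NAME[node]]
        elif state == "add":
            selectors = SELECTORS[node]
            value[node] = value[selectors["S-Factor"]] + value[selectors["S-Expr"]]
        else:
            raise ValueError(f"unknown state {state}")
    return value


values = execute(FLAT_FSM, CURRENT_STORE)
```

The resulting value attributes are 2, 4 and 1 for nodes two, five and six, 5 for node three, and 7 for node one.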

3 Gem-Mex: The Development Environment for 
Montages 

The development environment for Montages is given by the Gem-Mex tool [2, 3]. 
The intended use of the tool Gem-Mex is, on the one hand, to allow the designer to
‘debug’ her/his semantics descriptions by empirically testing whether the
intended decisions have been properly formalized and, on the other hand, to
automatically generate a correct (prototype) implementation of programming
languages from the description, including visualization and debugging facilities.

Gem-Mex is a system which assists the designer in a number of activities 
related to the language life cycle, from early design to routine programmer
usage. It consists of a number of interconnected components:

— a specialized graphical editor allows Montages to be entered and manipulated
in a convenient way;

— frames for the documentation of the specified languages are generated auto- 
matically; 

— the Montages executable generator (Mex) generates a correct and efficient 
interpreter of the language; 

— the generic animation and debugger tool visualizes the static and dynamic 
behavior of the specified language at a symbolic level; source programs writ- 
ten in the specified language and user-defined data structures can be ani- 
mated and inspected in a visual environment. 



3.1 Generation of Language Interpreters 

Using the formal semantics description given by the set of Montages and a num- 
ber of ADTs, the Gem-Mex system generates an interpreter for the specified lan- 
guage. The core of the Gem-Mex system is Aslan [1], which stands for Abstract 
State Machine Language and provides a fully-fledged implementation of the 
ASM approach. Aslan can also be used as a stand-alone, general purpose ASM 
implementation. The process of generating an executable interpreter consists of 
two phases: 

— The Montages containing the language definition are transformed to an in- 
termediate format and then translated to an ASM formalization according 
to the rules presented in the previous Sections. 

— The resulting ASM formalization is processed by the Aslan compiler gener- 
ating an executable version of the formalization, which represents an inter- 
preter implementing the formal semantics description of the specified lan- 
guage. 

Using Aslan as the core of the Gem-Mex system gives the user the possibility
to exploit the full power of the ASM framework to enrich the graphical ASM 
macros provided by Montages with additional formalization code. 

3.2 Generation of Visual Programming Environments 

Besides pure language interpreters, the Gem-Mex system is able to generate 
visual programming environments for the generated ASM formalization of the 
programming language semantics^. This is done by providing a generic debugging
and animation component which can be accessed by the generated executable.
During the translation of the Montages/ASM code, special instructions
are inserted that provide the information necessary to visualize the
execution of the formalization. In particular, the visual environment can be used
to debug the specification, animate the execution of it, and generate documents 
representing snapshots of the visualization of data structures during the execu- 
tion. The debugging features include stepwise execution, textual representation 
of ASM data structures, definition of break points, interactive term evaluation, 
and re-play of executions. 

3.3 Library of Programming Language Features 

A concept for providing libraries of programming language features is currently
under development. With this concept it shall be possible to reuse features of
programming languages that have already been specified in other Montages.
Examples of such features are arithmetic expressions, recursive function calls,
exception handling, parameter passing techniques, standard control features, etc.
The designer of a new language can then import such a feature and customize it
according to his or her needs. The customization may range from the substitution
of keywords to the selection among a set of variants for a certain feature,
for example different kinds of inheritance in object-oriented languages. In
the Verifix project [11], a number of reusable Montages have been defined with
the intention of reusing not only the Montages but also an associated
construction scheme for correct compilers.

4 Related Work 

Denotational semantics has been regarded as the most promising approach for
the semantic description of programming languages. However, its pragmatic
problems were already discovered in case studies of the scale of Pascal and
C [23]. Moreover, domain definitions often need to be changed when extending
the language with unforeseen constructs, for instance a change from the direct
style to the continuation style when adding gotos [20].

Other well-known meta-languages for specifying languages are Natural Semantics
[14], ASF+SDF [24], and Action Semantics [20]. For somebody who knows
mathematical logic, Natural Semantics is quite intuitive, and we used it for
the dynamic semantics of Oberon [15]. Although we succeeded thanks to the
excellent tool support of Centaur [7], the result was much longer and more
complex than the Montages counterpart given in [18], since one has to carry
around all the state information in the case of Natural Semantics. Similar
problems exist if ASF+SDF is applied to imperative languages. Action Semantics
solves these problems by providing standard solutions to the main concepts used
in programming languages. Unfortunately, the set of standard solutions is not
easily extensible.

* This feature is also available to all kinds of ASM formalizations implemented
in Aslan, not only to those generated from a Montages language specification.

Using ASMs for dynamic semantics, the work in [22] defines a framework 
comparable to ours. For the static part, it proposes occurrence algebras which 
integrate term algebras and context free grammars by providing terms for all 
nodes of all possible derivation trees. This allows such an approach to define all 
static aspects of the language in a functional algebraic system. Since reference
attribute grammars [12] correspond to occurrence algebras, the static aspects of
our formalism are almost identical to those in [22].

None of the discussed approaches uses visual descriptions of control flow,
and none of them supports structuring all specification aspects in a vertical
way, e.g. in self-contained modules for each language construct. As far as we
know, this way of structuring is novel with respect to existing frameworks. In
combination with refinements of the semantic functions involved and renaming of
the vocabulary, it allows large parts of language specifications to be reused
directly in other specifications. Programming language specifications can be
presented as a series of sub-languages, each reusing its predecessor and
extending it with new features. This specification structure has been used in
ASM case studies [6, 10] and was adapted to the Montages case study of Oberon
[18]. Our experience with Montages shows that such sub-languages are useful,
working languages that can be executed, tested, and explained to the user in
order to facilitate understanding of the whole language. The design and
prototyping of a language is much more productive if such stepwise development
and testing is possible.

Abstract. Software architecture is widely recognized as one of the most
fundamental concepts in software engineering, because today’s software
systems are assembled from components with different characteristics, for
example heterogeneous, legacy, or distributed systems. At the software
architecture level, designers combine subsystems into complete systems using
different techniques, e.g. “Architecture Description Languages” (ADLs). A
number of ADLs exist, each of which is specialized for one or more
architectural styles. They are designed by different research groups with
different goals in mind, corresponding to their mental model of how software
architecture can be expressed in the most efficient and elegant way. As a
result, ADLs are not compatible with each other, so that it is difficult to
present a homogeneous view of the software architecture of a system assembled
from different components. This paper presents an approach by which
architectural styles can be combined using a concept of ADL interchange.



1 Introduction 

The complexity of many of today’s software developments often makes it
unreasonable to fix a single architectural style for the design of the whole
software system. The need for multiple styles can come from either the problem
domain or the subparts used to construct the system. Consider the design of a
mobile phone network station, where several architectural styles need to be
combined. For receiving signals from the mobile phone, the architect may choose
a streaming pipe-and-filter style to handle the constant flow of repetitive
data. For processing signals, an event-based style may be chosen. For
interacting with the user, an event-based style “plus” a pipe-and-filter style
is chosen. For the part of the subsystem that is responsible for the collection
of independent components or special customer service queries, a
repository-based approach is chosen.

Problem statement: These high-level descriptions and the “plus” between these
styles sound attractive on paper, but when composing different architectural
styles, architects may have to rely on ad hoc methods, trusting their own
personal experience. Current practice tackles the component composition problem
on the technical layer using, e.g., scripting, brokers, RPC, event channels, or
similar approaches. These approaches place strong emphasis on solving technical
interaction problems. The realization of the overall problem specification is
obscured by these low-level problems.

Architecture Description Languages (ADLs) belong to the high-level approaches.
ADLs are intended to describe the system structure and behavior at a
sufficiently abstract level when dealing with large and complex systems [6].
A lot of work has been done in this research area, e.g. Aesop [10],
Unicon [11], ControlH [9], MetaH [5], Rapide [14], Darwin [15], Π [19],
UNAS [18], Wright [1], GenVoca [4].

But the heterogeneity of today’s software systems forces the use of different
components described in different ADLs. As a result, the ADLs become nearly
useless, because each ADL operates in a stand-alone fashion and they are not
interoperable. In large or heterogeneous systems, many of the common aspects of
architectural design support are re-implemented afresh. This means a lot of
unnecessary work, which is probably one of the reasons for the often discussed
question [6] in software architecture of why ADLs are only taken as early
life-cycle specification languages. A main reason for this lack of
interoperability is the underlying semantics of the architectural descriptions.
For instance, the notion of a component in ADL A could differ from the
component notion in ADL B.

In the following, we discuss two aspects: how to use different ADLs in a large
software system, and how to perform the composition task at the architectural
level. The basic idea is that there exists an interchange level between the
different architectural description means. We therefore introduce a service
layer as a platform and common service representation layer for component
composition (see Figure 1). The system description $S$ and the components
$C_1, \ldots, C_n$ are mapped to corresponding ASM descriptions
$S', C'_1, \ldots, C'_n$, which must be consistent with their unmapped versions.
This mapping can be done using standard techniques, such as language
translation or the definition of adaptors and wrappers. Then, as the most
challenging task, we transform the overall system specification $S'$ step by
step in such a way that it finally contains explicit references to the
interfaces of the existing components $C'_1, \ldots, C'_n$. An example of this
kind of transformation is the use of refinement techniques in formal
description methods [3]. The result of these stepwise transformations
represents the composition specification of the system. We call this final
specification $S^{+}$ in order to emphasize that it realizes the “sum” of the
components. Finally, we can analyze the resulting specification $S^{+}$ in
order to identify new components $X$ that need to be developed besides the
existing ones. As a side effect, the specification of these newly identified
components can then be obtained automatically from the specification $S^{+}$
and developed accordingly.







Asuman Sünbül



[Figure 1 (diagram; labels: System Specification, Component Description)]




Fig. 1. Service layer for the composition of components 



2 Why Map Architectural Descriptions to a Service
Layer Representation?

We argue that the combination of different ADLs during the design of a system
is useful for at least the following reasons:

— If an architectural description problem is best solved by a certain ADL A,
then using A is the most natural choice, even if A is not appropriate for
other parts of the system and other ADLs are therefore used there.

— Developers often have individual favorites for describing the architecture of
software. If a developer has the freedom to choose the ADL that he or she
wants, provided it is appropriate for the description of the problem, then his
or her productivity is much higher than if he or she is forced to use an ADL
that is fixed by the project policy. Often these “favorite” ADLs are none of
the well-known languages from the literature, but individually designed
“languages” whose semantics is normally given by an implicit agreement among
the members of a developer team.

Therefore, what is needed is the possibility to combine different ADLs so
that

- for different portions and/or aspects of the software architecture, the ADL
that fits best can be used, and

- the resulting combined architectural description is semantically consistent
w.r.t. the underlying models of the ADLs.

A promising way to solve this problem is to provide a concept for an
interchange of ADLs. In principle, the following alternatives exist for an
interchange between different ADLs:

— Defining a union language subsuming all the capabilities of the existing
languages. This approach seems unrealistic because of the manifold
characteristics of existing ADLs. There cannot be “the universal ADL” that
masters every requirement and every domain-specific use of ADLs.

— Defining an intersection language that incorporates the features contained
in each of the ADLs.

— Defining a service “interchange” providing services to describe composition
problems based on architectural descriptions.

The first two alternatives imply that the use of existing ADLs would be
restricted, because they would be at least partially replaced by a new ADL. The
experience gathered with the “old” ADLs would be lost, and users would be
forced to learn a new language. The advantage of the third approach is that
existing ADLs can be used as they are, because the interchange is done not on
the language level but on a basic semantic description level. Thus, the third
approach is much more promising, because there is no need to convince people,
now or in the future, to use a “better” approach for their architectural
descriptions. For the same reason, the third alternative also applies to the
integration of the architecture of legacy systems.

2.1 What Are the Problems Concerning ADL Interchange? 

The combination of ADLs A and B is less complicated if A and B are designed
to describe different aspects of the software. For example, if the static
structure of the system is described in ADL A, and the dynamic behavior is
encoded in ADL B, then the combination of these two descriptions should be
easier. The situation looks quite different if A and B are “competing” ADLs
designed for similar purposes. In this case, it is very important to carefully
analyze the underlying semantics of A and B, so that a combination is possible
and the consistency can be checked.

Therefore, an interplay of ADLs can only be achieved if the semantics of each
of them is unambiguously defined. Only with these descriptions is it possible
to formulate propositions that are valid for the combined architectural
description.

As a consequence, there must be one single description language for
formulating the underlying semantics of each of the ADLs. Thus, great care must
be taken in selecting the right one, which must meet at least the following
requirements:

— Because ADLs are manifold, the formal description language must be universal
in the sense that it is possible to describe the features of existing ADLs. In
particular, the language must be able to express static structures as well as
dynamic behavior.

— In order to be able to make statements about certain properties of the
combined architectural description (e.g. consistency, liveness), the
description language must have a well-defined mathematical basis.

— If, during the process of combining ADLs, it turns out that aspects important
for the interplay of the ADLs are not expressible in any of the participating
ADLs, the description language should also be usable as an alternative ADL in
order to insert the missing parts into the architectural description.

— The previous item implies that the description language must be intelligible
to the people involved in the software architecture.

— Because ADLs often describe large and complex software systems, the
underlying description language must be scalable.

— From a practical point of view, the description language should have a
notion of execution, so that support tools can directly generate code that
implements the interchange level.

The Abstract State Machine approach [12] seems to be a promising candidate
for describing software architecture models and the semantics of ADLs:

— Abstract State Machines (ASMs) are a universal, mathematically well-founded
method capable of describing static structures as well as the dynamic behavior
of systems.

— ASMs provide the possibility to choose appropriate levels of abstraction
according to the problem to be described. This feature is also important with
respect to scalability.

— ASMs have been used in many different problem areas. In the context of this
work, the use of ASMs in describing the semantics of programming languages
(e.g. [13]) and computer architecture (e.g. [7]) provides an excellent basis
for the task of describing software architectures.

— ASMs can also be executed; several tools exist that generate executable
versions of an ASM specification.

The aim of this approach is neither the development of a new architecture 
definition language, nor a prescription of a common vocabulary, nor the genera- 
tion of “architectural theorems” . The aim is to form a basis for the combination 
of ADLs by using existing work and building a low level concept that can directly 
be used to implement the interchange of ADLs. 

3 Scenario: Description of the Composition 

We assume in the following that we are working within the service layer. It
serves as a platform for the architectural description and component
composition. It abstracts from the architectural styles that were originally
used to describe the components. However, the translation from the original
description to the representation used in the service layer must be carried out
in such a way that no semantic information gets lost. The format of the
component description used in the service layer is very similar to the Π [19]
component model: it contains the services provided and required by a component
and, additionally, the specification of the functionality and dynamic behavior
of the component in ASM notation.

For the following description we revisit the example of Section 1. The
following composition problem is described as an example: the “processing
signals” component needs information from the “data management” component in
order to decide whether a phone connection can be established or not, because
of potentially existing limitations contained in the contract of a customer of
the phone company. In order to combine these two models, we translate each of
them into an ASM formalization. The union of these formalizations then forms
the interchange level where the architectural composition can actually be
performed.

As pointed out in the previous section, we use ASMs for this purpose. In
the first step, the architectural descriptions of the example mentioned in
Section 1 are automatically translated into an ASM description using techniques
like Montages [2]. The ASM formalization of the data structures of the
“processing signals” component is given as follows:

universes
    ConnectionData
    Process = {Receive, Connection,
               Timer, Disconnection}
    ProcessState = {active, passive}
functions
    state: Process -> ProcessState
    CurrentConnection: -> ConnectionData
    ConnectionProcess: -> Process
relation
    access_check: ConnectionData -> Boolean

For the “data management” component, the following relation is needed for 
describing the composition: 

relation
    checkAccess: ConnectionData -> Boolean

Using these data structures, the composition of the two components can be
specified as follows:

if state(ConnectionProcess) = active then
    if not access_check(CurrentConnection) then
        access_check(CurrentConnection) :=
            checkAccess(CurrentConnection)
    elseif access_check(CurrentConnection) then
        ! ConnectionProcessRules
    else
        error := "no access granted"
        state(ConnectionProcess) := passive
    endif
endif

In the next steps, this abstract description must be refined stepwise until a
layer is reached where concrete system access, based on the technical
description of the interface, can be modeled. These refinement steps are
omitted in this example.
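
The composition rule above can be illustrated by a small executable sketch. It is not part of the ASM specification; it only mimics one rule application in Python. Two assumptions are made and should be read as such: the guard `not access_check(CurrentConnection)` is interpreted as "the access check has not been performed yet" (modeled by `None`), since otherwise the final `else` branch could never fire, and the names `ServiceLayerState`, `composition_step` and `rule_fired` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ServiceLayerState:
    """Hypothetical snapshot of the dynamic functions used by the rule."""
    process_state: str = "active"         # state(ConnectionProcess)
    current_connection: str = "conn-1"    # CurrentConnection
    access_check: Optional[bool] = None   # None: not yet checked (assumption)
    error: Optional[str] = None
    rule_fired: bool = False              # stands in for ConnectionProcessRules


def composition_step(state: ServiceLayerState,
                     check_access: Callable[[str], bool]) -> ServiceLayerState:
    """Apply the composition rule once, updating the state in place."""
    if state.process_state != "active":
        return state
    if state.access_check is None:
        # Ask the "data management" component and record its answer.
        state.access_check = check_access(state.current_connection)
    elif state.access_check:
        # Access granted: continue with the connection process rules.
        state.rule_fired = True
    else:
        state.error = "no access granted"
        state.process_state = "passive"
    return state
```

Applying `composition_step` twice to a fresh state first records the answer of the data management component and then either continues with the connection rules or reports the error and deactivates the connection process.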

4 Related Work 

Currently, the only effort that has been undertaken to build an interchange of
ADLs is the Acme approach [16], which is still under development. Acme is a
software architecture description language that aims at providing a common
interchange format for software architecture, so that different tools for
architecture description languages can be integrated. The main difference
between the approach presented in this work and Acme is the following: Acme
aims at the convergence of all ADL-related research activities into the Acme
framework and tries to form an interchange between ADLs on the language level.
Our approach retains existing ADLs by pushing the interchange activities to a
lower level, the semantic description level of these ADLs.

5 Conclusion 

Based on the fact that architectural design fragments using different
architectural description means often need to be combined into larger
architectures, this paper presents a concept for composing different
architectural styles. This is achieved by providing an interchange level for
architectural composition. This work focuses on ADLs and provides basic
concepts for composition based on ADLs. In contrast to existing approaches for
combining ADLs, the idea presented here does not build on a consensus among ADL
developers, present or future, because neither a superset nor an intersection
of existing ADLs needs to be introduced. Following our approach, the
composition keeps the freedom of choosing the architectural description means
that is most suitable for the actual problem. The choice of an ADL is not
restricted by the needs of the composition task.

Abstract. An abstract framework is developed to describe program
transformation by specializing a given program to a restricted set of
inputs. Particular cases include partial evaluation [19] and Turchin’s
more powerful “driving” transformation [33]. Such automatic program
transformations have been seen to give quite significant speedups in
practical applications.

This paper’s aims are similar to those of [18]: better to understand the 
fundamental mathematical phenomena that make such speedups pos- 
sible. The current paper is more complete than [18], since it precisely 
formulates correctness of code generation; and more powerful, since it 
includes program optimizations not achievable by simple partial evalua- 
tion. Moreover, for the first time it puts Turchin’s driving methodology 
on a solid semantic foundation which is not tied to any particular pro- 
gramming language or data structure. 

This paper is dedicated to Satoru Takasu with thanks for good advice 
early in my career on how to do research, and for insight into how to see 
the essential part of a new problem. 



1 Introduction 

1.1 History 

Automatic program specialization evolved independently at several different 
times and places [13,31,33,5,11,20]. In recent years partial evaluation has re- 
ceived much attention ([19,6], and several conferences), and work has been done 
on other automatic transformations including Wadler’s well-known deforestation 
[37,7,26]. 

Many of these active research themes were anticipated in the 1970’s by
Valentin Turchin in Moscow [29,30] in his research on supercompilation
(= supervised computation and compilation), and experiments were made with
implementations. Examples include program optimization both by deforestation
and by partial evaluation; the use and significance of self-application for
generating compilers and other program generators; and the use of grammars as a
tool in program transformation [31,32,17]. Recent works on driving and
supercompilation include [33,14,15,27,24,22,1,36].

* This work was supported in part by the Danish Natural Science Research Council
(DART project) and by an Esprit Basic Research Action (Semantique).

1.2 Goals 

The purpose of this paper is to formulate the essential concepts of supercom- 
pilation in an abstract and language-independent way. For simplicity we treat 
only imperative programs, and intentionally do not make explicit the nature of 
either commands or the store, except as needed for examples. 

At the core of supercompilation is the program transformation called driving 
(Russian “progonka”). In principle driving is stronger than both deforestation 
and partial evaluation [27,37,12,19], and an example will be given to show this 
(the pattern matching example at the end of the paper). On the other hand, 
driving has taken longer to come into practical use than either deforestation or 
partial evaluation, for several reasons. 

First, the greater strength of driving makes it correspondingly harder to 
tame; cause and effect are less easily understood than in deforestation and par- 
tial evaluation, and in fact it is only in the latter case that self-application has 
been achieved on practical applications. Second, the first papers were in Russian, 
and they and later ones used a computer language, Refal¹, unfamiliar to western
readers. Finally, the presentation style of the supercompilation papers is un- 
familiar, using examples and sketches of algorithms rather than mathematical 
formulations of the basic ideas, and avoiding even set theory for philosophical 
reasons [34]. 

We hope the abstract framework will lead to greater practical exploitation of 
the principles underlying supercompilation (stronger program transformations, 
more automatic systems, new languages), and a better understanding in principle 
of the difficult problem of ensuring termination of program transformation. 

1.3 Preliminary Definitions 

First, a quite abstract definition of an imperative program is given, as a state 
transition system. In our opinion the essence of the “driving” concept is more 
clearly exposed at this level. Later, a more intuitive flow chart formalism will be 
used for examples, and to clarify the problem of code generation. 

Definition 1. An abstract program is a quadruple $\pi = (P, S, \to, p_0)$ where
$p_0 \in P$ and $\to\ \subseteq (P \times S) \times (P \times S)$. Terminology:
$P$ is the set of program points, $S$ is the set of stores, $\to$ is the
transition relation, and $p_0$ is the initial program point. We write $\to$ in
infix notation, e.g. $(p, s) \to (p', s')$ instead of
$((p, s), (p', s')) \in\ \to$.

A state is a pair $(p, s) \in P \times S$.

¹ Refal is essentially a language of Markov algorithms extended with variables
and two kinds of brackets to create tree structures. A program is a sequence of
rewrite rules, used to transform data in the form of associative and possibly
nested symbol strings. In contrast with most pattern matching languages, most
general unifiers do not always exist.

A store such as $[X \mapsto 1:2:[\,],\ Y \mapsto 2:(4:5):[\,]]$ usually maps
program variables to their values. A program point may be a flow chart node, or
can be thought of as a label in a program.

Definition 2. $p \in P$ is transient if $(p, s_1) \to (p', s')$ and
$(p, s_2) \to (p'', s'')$ imply $p' = p''$, i.e. there is at most one $p'$ with
$(p, \_) \to (p', \_)$. State $(p, s)$ is terminal if $(p, s) \to (p', s')$
holds for no $(p', s')$. The abstract program $\pi$ is deterministic if for all
states $(p, s)$, $(p, s) \to (p', s')$ and $(p, s) \to (p'', s'')$ imply
$p' = p''$ and $s' = s''$.
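
For a finite program, the notions of the first two definitions can be checked mechanically. The following is a minimal sketch under stated assumptions: program points are strings, stores are the integers 0 to 3 (the value of a single variable X), and the class `AbstractProgram` and the function `example_program` are illustrative names that do not appear in the paper.

```python
from itertools import product


class AbstractProgram:
    """Finite instance of an abstract program (P, S, ->, p0)."""

    def __init__(self, points, stores, transitions, p0):
        self.points = set(points)
        self.stores = set(stores)
        # Each transition is a pair ((p, s), (p2, s2)) of states.
        self.transitions = set(transitions)
        self.p0 = p0

    def successors(self, state):
        return {dst for src, dst in self.transitions if src == state}

    def is_terminal(self, state):
        # A state is terminal if it has no successor state.
        return not self.successors(state)

    def is_transient(self, p):
        # At most one program point is reachable from p in one step.
        targets = {dst[0] for src, dst in self.transitions if src[0] == p}
        return len(targets) <= 1

    def is_deterministic(self):
        # Every state has at most one successor state.
        return all(len(self.successors(state)) <= 1
                   for state in product(self.points, self.stores))


def example_program():
    """a: if odd(X) goto b else goto c;  b: X := X - 1; goto c."""
    stores = range(4)
    transitions = set()
    for x in stores:
        transitions.add((("a", x), ("b" if x % 2 else "c", x)))
        if x > 0:
            transitions.add((("b", x), ("c", x - 1)))
    return AbstractProgram({"a", "b", "c"}, stores, transitions, "a")
```

In this example the point `a` is not transient, because the test on X can lead to either `b` or `c`, while `b` is transient and every state at `c` is terminal.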



Definition 3. A computation (from $s_0 \in S$) is a finite or infinite sequence
$(p_0, s_0) \to (p_1, s_1) \to (p_2, s_2) \to \cdots$

Notation: subsets of $S$ will be indicated by overlines, so $\bar{s} \subseteq S$.
Given this, and defining $\to^{*}$ to be the reflexive transitive closure of
$\to$, the input/output relation that $\pi$ defines on $\bar{s}_0 \subseteq S$ is

$$IO(\pi, \bar{s}_0) = \{ (s_0, s_t) \mid s_0 \in \bar{s}_0,\ (p_0, s_0) \to^{*} (p_t, s_t),\ \text{and } (p_t, s_t) \text{ is terminal} \}$$
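
For a finite transition relation, the input/output relation can be computed by a simple reachability search. The sketch below assumes that transitions are given as a set of state pairs and that program points and stores are hashable values; the function name `io_relation` and the example program are illustrative only. Computations that never reach a terminal state contribute nothing to the result.

```python
def io_relation(transitions, p0, initial_stores):
    """Return {(s0, st)} for all terminal states (pt, st) reachable from (p0, s0)."""
    successors = {}
    for src, dst in transitions:
        successors.setdefault(src, set()).add(dst)

    result = set()
    for s0 in initial_stores:
        seen, stack = set(), [(p0, s0)]
        while stack:
            state = stack.pop()
            if state in seen:
                continue
            seen.add(state)
            nxt = successors.get(state, set())
            if not nxt:
                # Terminal state: record the final store for this input.
                result.add((s0, state[1]))
            stack.extend(nxt)
    return result


# Example: a: if odd(X) goto b else goto c;  b: X := X - 1; goto c.
TRANSITIONS = set()
for x in range(4):
    TRANSITIONS.add((("a", x), ("b" if x % 2 else "c", x)))
    if x > 0:
        TRANSITIONS.add((("b", x), ("c", x - 1)))
```

Restricting the inputs to the odd stores, `io_relation(TRANSITIONS, "a", {1, 3})` relates each input to its predecessor, since only the branch through `b` is taken.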

More concretely, programs can be given by flow charts whose edges are labeled
by commands. These are interpreted by a command semantics:

$$\mathcal{C}[\![\_]\!] : Command \to (S \rightharpoonup S)$$

where $Command$ and $S$ are unspecified sets (but $S$ = the set of stores as
above).

Definition 4. A flow chart is a rooted, edge-labeled directed graph
$F = (P, E, p_0)$ where $p_0 \in P$ and $E \subseteq P \times Command \times P$
(the edges of $F$). We write $p \stackrel{C}{\Rightarrow} p'$ whenever
$(p, C, p') \in E$.

If $p \stackrel{C}{\Rightarrow} p'$ then $C$ denotes a store transformation,
e.g. $C$ could be an assignment statement changing a variable’s value. The
formulation includes tests too: the domain of the partial function
$\mathcal{C}[\![C]\!]$ is the set of stores which cause transition from program
point $p$ to $p'$. For example, command “if odd(X) goto” might label that edge,
corresponding to “p: if odd(X) then goto p'” in concrete syntax.

Definition 5. The program denoted by $F$ is $\pi_F = (P, S, \to, p_0)$, where
$(p, s) \to (p', s')$ if and only if $s' = \mathcal{C}[\![C]\!]s$ for some
$p \stackrel{C}{\Rightarrow} p'$.
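
The passage from a flow chart to the program it denotes can be sketched for finite store sets as follows. The encoding is an assumption made for illustration: a command is a Python function that returns the new store, or `None` when the store lies outside its domain (which is how tests are expressed), and stores are the integer values of a single variable X. The names `program_from_flowchart`, `if_odd`, `if_even` and `decrement` are hypothetical.

```python
def program_from_flowchart(edges, stores):
    """Build the transition relation denoted by a flow chart.

    edges  : iterable of (p, command, p2) triples
    stores : finite iterable of stores
    """
    transitions = set()
    for p, command, p2 in edges:
        for s in stores:
            s2 = command(s)
            if s2 is not None:  # s lies in the domain of the command
                transitions.add(((p, s), (p2, s2)))
    return transitions


def if_odd(x):
    # Test command: defined only on stores where X is odd.
    return x if x % 2 == 1 else None


def if_even(x):
    # Test command: defined only on stores where X is even.
    return x if x % 2 == 0 else None


def decrement(x):
    # Assignment X := X - 1, kept partial so that X stays non-negative.
    return x - 1 if x > 0 else None


EDGES = [("p", if_odd, "q"), ("p", if_even, "r"), ("q", decrement, "r")]
TRANSITIONS = program_from_flowchart(EDGES, range(4))
```

Because the two tests leaving `p` have disjoint domains, each state at `p` has exactly one successor, so the denoted program is deterministic although `p` is not transient.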

2 Driven Programs, without Store Transformations 

A major use of driving (and partial evaluation) is for program specialization. For 
simplicity we begin with a rather weak form of driving that does not modify the 
store, and give a stronger version in the next section. 




The Essence of Program Transformation by Partial Evaluation and Driving 






Given partial information about a program’s inputs (represented by a subset
$\bar{s}_0 \subseteq S$ of all possible stores), driving transforms program
$\pi$ into another program $\pi_d$ that is equivalent to $\pi$ on any initial
store $s_0 \in \bar{s}_0$. The goal is efficiency: once $\pi_d$ has been
constructed, local optimizations of transition chain compression and reduced
code generation can yield a much faster program than $\pi$, as seen in [18,19]
and many others.

A useful principle is to begin by saying what is to be done, as simply as
possible, before giving constructions and algorithms saying how it can be
accomplished. We thus first define what it means for a program $\pi_d$ to be a
“driven” form of program $\pi$, and defer the question of how to perform
driving to Section 4.

Intuitively $\pi_d$ is an “exploded” form of $\pi$ in which any of $\pi$’s
program points $p$ may have several annotated versions
$(p, \bar{s}_1), (p, \bar{s}_2), \ldots$ Each $\bar{s}_i$ is a set of stores,
required always to contain the current store in any computation by $\pi_d$.

Computations by $\pi_d$ (state sequences) will be in a one-to-one
correspondence with those of $\pi$, so nothing may seem to have been gained
(and something lost, since $\pi_d$ may be bigger than $\pi$). However, if
control ever reaches an annotated program point $(p, \bar{s})$ in $\pi_d$, then
the current runtime store must lie in $\bar{s}$. For example, $\bar{s}$ could
be the set of all stores such that the value of variable X is always even.

This information is the source of all improvements gained by partial
evaluation or driving. Its use is to optimize $\pi_d$ by generating equivalent
but more efficient code exploiting the information given by $\bar{s}$. In
particular some computations may be elided altogether, since their effect can
be achieved by using the $\bar{s}$ at transformation time; and knowledge of
$\bar{s}$ often allows a much more economical representation of the stores
$s \in \bar{s}$.



2.1 Abstract Formulation 

The following is, in our opinion, the essential core of the driving concept: 

Definition 6. Given program $\pi = (P, S, \to, p_0)$, program
$\pi_d = (P_d, S, \to_d, (p_0, \bar{s}_0))$ is an $\bar{s}_0$-driven form of
$\pi$ if $P_d \subseteq P \times \mathcal{P}(S)$ and $\pi_d$ satisfies the
following conditions.

1. $((p, \bar{s}), s) \to_d ((p', \bar{s}'), s')$ and $s \in \bar{s}$ imply
$(p, s) \to (p', s')$. (soundness)

2. $(p, \bar{s}) \in P_d$, $(p, s) \to (p', s')$, and $s \in \bar{s}$ imply that
there exists $\bar{s}'$ such that
$((p, \bar{s}), s) \to_d ((p', \bar{s}'), s')$. (completeness)

3. $((p, \bar{s}), s) \to_d ((p', \bar{s}'), s')$ and $s \in \bar{s}$ imply
$s' \in \bar{s}'$. (invariance of $s \in \bar{s}$)

To begin with, $P_d \subseteq P \times \mathcal{P}(S)$, so a program point of
$\pi_d$ is a pair $(p, \bar{s})$ where $\bar{s} \subseteq S$ is a set of stores.
The soundness condition says that $\pi_d$ can do only the store transformations
that $\pi$ can do. The completeness condition says that for any driven program
point $(p, \bar{s})$ of $\pi_d$, any store transformation that $\pi$ can do
from $p$ on stores $s \in \bar{s}$ can also be done by $\pi_d$.
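
For finite programs the three conditions of the definition can be tested directly. The following sketch assumes that store sets are represented as `frozenset` values, that a driven program point is a pair `(p, store_set)`, and that both transition relations are given as sets of state pairs. The function `check_driven_form` and the small example (a program specialized to odd values of X) are illustrative and are not taken from the paper.

```python
def check_driven_form(transitions, driven_points, driven_transitions):
    """Return (sound, complete, invariant) for a candidate driven program."""
    sound = complete = invariant = True

    for ((p, sbar), s), ((p2, sbar2), s2) in driven_transitions:
        if s in sbar:
            # Soundness: the driven step must be a step of the original.
            if ((p, s), (p2, s2)) not in transitions:
                sound = False
            # Invariance: the new store must lie in the new annotation.
            if s2 not in sbar2:
                invariant = False

    for (p, sbar) in driven_points:
        for (q, s), (p2, s2) in transitions:
            if q == p and s in sbar:
                # Completeness: some annotated version of p2 must be reached.
                if not any(src == ((p, sbar), s)
                           and dst[0][0] == p2 and dst[1] == s2
                           for src, dst in driven_transitions):
                    complete = False
    return sound, complete, invariant


ODD = frozenset({1, 3})
EVEN = frozenset({0, 2})

# Original program: a: if odd(X) goto b else goto c;  b: X := X - 1; goto c.
ORIGINAL = set()
for x in range(4):
    ORIGINAL.add((("a", x), ("b" if x % 2 else "c", x)))
    if x > 0:
        ORIGINAL.add((("b", x), ("c", x - 1)))

# Candidate driven form for the initial store set ODD.
DRIVEN_POINTS = {("a", ODD), ("b", ODD), ("c", EVEN)}
DRIVEN = set()
for x in ODD:
    DRIVEN.add(((("a", ODD), x), (("b", ODD), x)))
    DRIVEN.add(((("b", ODD), x), (("c", EVEN), x - 1)))
```

In the candidate driven form the branch from `a` to `c` has disappeared, because no odd store can take it; this is the kind of information that later code generation can exploit. Removing one of the transitions from `DRIVEN` makes the completeness check fail while soundness and invariance still hold.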

Programs may in principle be infinite, but in practice we are only interested
in finite ones.

The Significance of Store Sets. The invariance of $s \in \bar{s}$ in a
transition $((p, \bar{s}), s) \to_d ((p', \bar{s}'), s')$ expresses a form of
information propagation carried out at program transformation time [14,15].

One can think of a store set as a predicate describing variable value 
relationships, e.g. “X is even” or “X = Y + 1 ∧ Z < T”. Store sets could thus be 
manipulated in the form of logical formulas. 

This view has much in common with regarding statements as forward or 
backward predicate transformers, as used by Dijkstra and many others for proving 
programs correct [10]. Further, a store set $\bar{s}$ that annotates a program point $p$ 
corresponds to an invariant, i.e. a relationship among variable values that holds 
whenever control reaches point $(p, \bar{s})$ in the transformed program. 

Instead of formulas, one could describe store sets using a set of abstract 
values $\Sigma$, using for example a function $\gamma : \Sigma \to \mathcal{P}(S)$ that maps an abstract 
value $\sigma \in \Sigma$ to the store set it denotes. In logic $\gamma$ is called an interpretation, and 
Turchin uses the term configuration for such a store set description [33]. 

This idea is a cornerstone of abstract interpretation, where $\gamma$ is called a 
concretization function [9,2,16]. Our approach can thus be described as program 
specialization by abstract interpretation. The abstract values are constructed “on 
the fly” during program transformation to create new specialized program points. 
This is in contrast to most abstract interpretations, which iterate until the abstract 
values associated with the original program’s program points reach their 
collective least fixpoint. 

Lemma 1. If $\pi_d$ is an $\bar{s}_0$-driven form of $\pi$, then for any $s_0 \in \bar{s}_0$ there is a 
computation 

$$(p_0,s_0) \to (p_1,s_1) \to (p_2,s_2) \to \cdots$$

if and only if there is a computation 

$$((p_0,\bar{s}_0),s_0) \to_d ((p_1,\bar{s}_1),s_1) \to_d ((p_2,\bar{s}_2),s_2) \to_d \cdots$$

Proof. “If” follows from soundness, “only if” by completeness and invariance of 
$s \in \bar{s}$. 

Corollary 1. $IO(\pi, \bar{s}_0) = IO(\pi_d, \bar{s}_0)$ 



Program Specialization by Driving. Informally, program $\pi$ is transformed 
as follows: 

1. Given $\pi$ and an initial set of stores $\bar{s}_0$ to which $\pi$ is to be specialized, construct 
a driven program $\pi_d$. In practice, $\pi$ will be given in flow chart or other 
concrete syntactic form, and finite descriptions of store sets will be used. 

2. Improve $\pi_d$ by removing unreachable branches, and by compressing 
sequences of transient transitions into single-step transitions. 

The Essence of Program Transformation by Partial Evaluation and Driving 

3. If $\pi = \pi^F$ where $F$ is a given flow chart, then $F_d$ is constructed and improved 
in the same way: by compressing transitions, and generating appropriately 
simplified commands as edge labels. 

The idea is that knowing a store set $\bar{s}$ gives contextual information used to 
transform $\pi_d$ to make it run faster. Conditions for correct code generation will 
be given after we discuss the choice of store sets and the use of alternative store 
representations in Section 3. 



2.2 Extreme and Intermediate Cases 

In spite of the close correspondence between the computations of $\pi$ and $\pi_d$, 
there is a wide latitude in the choice of $\pi_d$. Different choices will lead to different 
degrees of optimization. For practical use we need intermediate cases for which 
$\pi_d$ has finitely many program points, and its store sets $\bar{s}$ are small enough (i.e. 
precise enough) to allow significant code optimization. 

We will see a pattern-matching example where a program with two inputs of 
size $m$, $n$ that runs in time $a \cdot m \cdot n$ can, by specializing to a fixed first input, be 
transformed into one running in time $b \cdot n$ where $b$ is independent of $m$. 

One extreme case is to choose every $\bar{s}$ to be equal to $S$. In this case $\pi_d$ is 
identical to $\pi$, so no speedup is gained. Another extreme is to define $\pi_d$ to contain 
$((p,\bar{s}),s) \to_d ((p',\{s'\}),s')$ whenever $(p,s) \to (p',s')$, $s \in \bar{s}$, and $(p,\bar{s}) \in P_d$. 
In this case $\pi_d$ amounts to a totally unfolded version containing all possible 
computations on inputs from $\bar{s}_0$. 



State Set Choice and Code Generation. The extreme just described will 
nearly always give infinite programs. It is not at all natural for code generation, 
as it deals with states one at a time. 

In flow chart form, a test amounts to two different transitions $p \stackrel{C}{\Rightarrow} p'$ and 
$p \stackrel{C'}{\Rightarrow} p''$ from the same $p$. A more interesting extreme can be obtained from 
the following principle: the driven program should contain no tests that are not 
present in the original program. The essence of this can be described without 
flow charts as follows. 

Definition 7. $\pi_d$ requires no new tests if whenever $\pi$ contains $(p,s) \to (p',s')$, 
$s \in \bar{s}$, and $\pi_d$ contains $((p,\bar{s}),s) \to_d ((p',\bar{s}'),s')$, then 

$$\bar{s}' \supseteq \{ s_2 \mid \exists s_1 \in \bar{s} \text{ such that } (p,s_1) \to (p',s_2) \text{ is in } \pi \}$$

This defines the new store set $\bar{s}'$ to be inclusive, meaning that it contains every 
store reachable from any store in $\bar{s}$ by $\pi$ transitions from $p$ to $p'$. The target 
store set $\bar{s}'$ of a driven transition $((p,\bar{s}),s) \to_d ((p',\bar{s}'),s')$ includes not only the 
target $s'$ of $s$, but also the targets of all its “siblings” $s_1 \in \bar{s}$ that go from $p$ to 
$p'$. 

For deterministic programs, this amounts to requiring that $\pi_d$ can only perform 
tests that are also performed by $\pi$. This is a reasonable restriction for code 
generation purposes, but is by no means necessary: if one somehow knows that 
the value of a given variable $x$ must lie in a finite set $X = \{a, b, \ldots, k\}$, new tests 
could be generated to select specialized commands for each case of $x \in X$. 

Even though these new tests may seem unnecessary since they were not 
present in the original program, one often gains efficiency because the value of 
$x$ will be known exactly in each of the specialized commands, leading to smaller 
subsequent code. See the discussion on “bounded static variation” in [19]. 

An $\bar{s}_0$-driven form of $\pi$ can always be obtained by choosing equality rather 
than set containment for $\bar{s}'$, and choosing $P_d$ to contain the smallest set of 
program points including $(p_0, \bar{s}_0)$ and closed under the definition above. This 
extreme preserves all possible information about the computation subject to the 
inclusiveness condition. It can be used in principle to produce a “most completely 
optimized” version of the given program, but suffers from two practical problems: 

First, this $\bar{s}_0$-driven $\pi_d$ will very often contain infinitely many specialized 
program points $(p,\bar{s})$. Second, its transition relation may not be computable. 



Generalization. It is a subtle problem in practice to guarantee that the transformed 
program both is finite and is more efficient than the original program. A 
solution in practice is not to work with the mathematically defined and usually 
infinite store sets above, but rather to use finite descriptions of perhaps larger 
sets $\bar{s}'' \supseteq \bar{s}'$ that can be manipulated by computable operations. 

Finiteness of the transformed program can be achieved by choosing describ- 
able store sets that are larger than s' but which are still small enough to allow 
significant optimizations. 

Turchin uses the term configuration for such a store set description, and 
generalization for the problem of choosing configurations to yield both finiteness 
and efficiency [33,35]. 

2.3 Driven Flow Charts 

We now reformulate the former abstract definition for flow charts. For now we 
leave commands unchanged, as Section 3 will discuss store modifications and 
code generation together. 

Definition 8. Given flow chart $F = (P,E,p_0)$ and $\bar{s}_0 \subseteq S$, $F_d = (P_d, E_d, (p_0, \bar{s}_0))$ 
is an $\bar{s}_0$-driven form of $F$ if $P_d \subseteq P \times \mathcal{P}(S)$ and $F_d$ satisfies the following 
conditions. 

1. $(p,\bar{s}) \stackrel{C}{\Rightarrow} (p',\bar{s}')$ in $F_d$ implies $p \stackrel{C}{\Rightarrow} p'$ in $F$. (soundness) 

2. $(p,\bar{s}) \in P_d$, $\bar{s} \neq \{\}$, and $p \stackrel{C}{\Rightarrow} p'$ in $F$ imply that $(p,\bar{s}) \stackrel{C}{\Rightarrow} (p',\bar{s}')$ in $F_d$ for 
   some $\bar{s}'$. (completeness) 

3. $(p,\bar{s}) \stackrel{C}{\Rightarrow} (p',\bar{s}')$ in $F_d$ and $s \in \bar{s}$ and $s' = \mathcal{C}[\![C]\!]s$ is defined, imply $s' \in \bar{s}'$. 
   (invariance of $s \in \bar{s}$) 

Fig. 1. Diagram of a simple flow chart program 

Theorem 1. If $F_d$ is an $\bar{s}_0$-driven form of $F$, then $\pi^{F_d}$ is an $\bar{s}_0$-driven form of 
$\pi^F$. 

Proof. This is easily verified from Definitions 5 and 8, as the latter is entirely 
parallel to Definition 6. 

2.4 An Example 

Collatz’ problem in number theory amounts to determining whether the follow- 
ing program terminates for all positive n. To our knowledge it is still unsolved. 

A: while n ≠ 1 do 
B:    if n even 
         then (C: n := n ÷ 2; ) 
         else (D: n := 3 * n + 1; ) 
      fi 
   od 
G: 

Its flow chart equivalent is $F = (P,E,A)$ where $P = \{A, B, C, D, G\}$ and edge 
set $E$ is given by the diagram in Figure 1. The program has only one variable 
n, so a store set is essentially a set of values. 

We use just four store sets: 

Even = $\{ [n \mapsto x] \mid x \in \{0, 2, 4, \ldots\} \}$ 

Odd = $\{ [n \mapsto x] \mid x \in \{1, 3, 5, \ldots\} \}$ 

$\top$ = $\{ [n \mapsto x] \mid x \in \mathbb{N} \}$ 

$\bot$ = $\{\}$ 
Fig. 2. A driven version of the same program 

The flow chart of Figure 2 is a driven version of $F$. Specialized program points 
$(D, \bot)$ and $(G, \bot)$ are unreachable since they have empty store sets. The driven 
version, though larger, contains two transient transitions, from (A, Even) and (B, 
Even). Transition compression redirects the branch from (D, Odd) to (C, Even) 
to give a somewhat better program, faster in that two tests are avoided whenever 
n becomes odd. 
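
The effect of this transition compression can be observed by instrumenting both versions. The Python sketch below is only an illustration: the function names and the test counters are assumptions introduced here, and the edge structure of Figures 1 and 2 is inferred from the text rather than copied from the figures. Inputs are assumed to be positive integers on which the program terminates.

```python
def collatz_original(n):
    """Run the original flow chart; return (values of n visited, tests executed)."""
    tests, trace = 0, [n]
    while True:
        tests += 1                 # A: test n != 1
        if n == 1:
            return trace, tests
        tests += 1                 # B: test n even
        if n % 2 == 0:
            n = n // 2             # C
        else:
            n = 3 * n + 1          # D
        trace.append(n)


def collatz_driven(n):
    """Driven version after transition compression.

    The edge leaving (D, Odd) goes straight to (C, Even), so the tests at
    (A, Even) and (B, Even) are never executed."""
    tests, trace = 0, [n]
    while True:
        tests += 1                 # (A, T): test n != 1
        if n == 1:
            return trace, tests
        tests += 1                 # (B, T): test n even
        if n % 2 != 0:
            n = 3 * n + 1          # (D, Odd): n is now even and greater than 1
            trace.append(n)
        n = n // 2                 # (C, Even)
        trace.append(n)


trace_a, tests_a = collatz_original(3)
trace_b, tests_b = collatz_driven(3)
```

Both versions visit the same sequence of values; the driven one saves exactly two tests for every odd value other than 1 that occurs along the way.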

3 Driven Programs, with Store Transformations 

According to Definition 6, a driven program $\pi_d$ has exactly the same stores as 
$\pi$. As a consequence the only real optimizations that can occur are from collapsing 
transient transition chains, and little computational optimization occurs. We 
now revise this definition, “retyping” the store to obtain more powerful transformations 
such as those of partial evaluation by projections [19,18,21] or arity 
raising [25]. 

3.1 Abstract Formulation 

From now on $S_d$ will denote the set of possible stores in driven program $\pi_d$. 
Given the knowledge that $s \in \bar{s}$, a store $s$ of $\pi$ can often be represented in the 
driven program $\pi_d$ by a simpler store $s_d \in S_d$. For example, if 

$$\bar{s} = \{ [X \mapsto 1, Y \mapsto y, Z \mapsto 3] \mid y \in \mathbb{N} \}$$

then $s \in \bar{s}$ at $\pi_d$ program point $(p, \bar{s})$ can be represented by the value of Y alone 
since X, Z are known from context. In practice, $\bar{s}$ will be described finitely, e.g. 
by an abstract value $\sigma$ in description set $\Sigma$: 

$$\sigma = [X \mapsto 1, Y \mapsto \top, Z \mapsto 3]$$

together with concretization function (or interpretation) $\gamma : \Sigma \to \mathcal{P}(S)$. To 
formalize this abstractly, we assume given a (partial) function 

$$\Delta : \mathcal{P}(S) \times S_d \to S$$

satisfying the following two properties (note that $\Delta$ is written in infix notation): 

1. $\bar{s} \Delta s_d \in \bar{s}$ whenever $\bar{s} \subseteq S$, $s_d \in S_d$, and $\bar{s} \Delta s_d$ is defined; and 

2. $\bar{s} \Delta s_d = \bar{s} \Delta s_d' = s$ implies $s_d = s_d'$. 

One can think of $\Delta$ as a reconstruction function to build $s$ from store set $\bar{s}$ 
and a driven store $s_d$. For example, if $\bar{s}$ is as above and if $s_d$ is, say, $[Y \mapsto 5]$ 
then we would have $\bar{s} \Delta s_d = [X \mapsto 1, Y \mapsto 5, Z \mapsto 3]$. 

The restriction $\bar{s} \Delta s_d \in \bar{s}$ says that $s_d$ can only represent a store in the current 
$\bar{s}$. The second restriction says that $\Delta$ is injective in its second argument. 

The previous formulation without store transformations is expressible by 
putting $S = S_d$, and letting $\bar{s} \Delta s_d = s_d$ when $s_d \in \bar{s}$, with $\bar{s} \Delta s_d$ undefined 
otherwise. 
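
A reconstruction function for store-set descriptions of the kind used in the example can be sketched in a few lines of Python. Everything named here (`TOP`, `delta`, `member`, and the use of dictionaries for stores) is an assumption made for illustration; the text only requires that the function be partial, land in the described store set, and be injective in its second argument.

```python
TOP = object()  # marker for "value unknown at transformation time"


def delta(desc, sd):
    """Rebuild a full store from a description and a driven store.

    desc maps every variable to a known value or to TOP; sd must supply
    values for exactly the TOP variables. Returns None when undefined."""
    unknown = {v for v, val in desc.items() if val is TOP}
    if set(sd) != unknown:
        return None
    return {v: (sd[v] if val is TOP else val) for v, val in desc.items()}


def member(desc, s):
    """Does store s belong to the store set described by desc?"""
    return set(s) == set(desc) and all(
        val is TOP or s[v] == val for v, val in desc.items())


desc = {"X": 1, "Y": TOP, "Z": 3}
s = delta(desc, {"Y": 5})
```

With this encoding the driven store carries only the value of Y, and two different driven stores are always reconstructed to two different full stores.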

We will see that allowing alternative representations of the driven stores 
enables much stronger program optimizations. The new version of Definition 6 is 
as follows. The essential idea is that a transition 

$$(p, s) \to (p', s') \;=\; (p, \bar{s} \Delta s_d) \to (p', \bar{s}' \Delta s_d')$$

is transformed, by a kind of reassociation, into a specialized transition of form 

$$((p,\bar{s}), s_d) \to_d ((p',\bar{s}'), s_d')$$

Definition 9. Program $\pi_d = (P_d, S_d, \to_d, (p_0, \bar{s}_0))$ is an $\bar{s}_0$-driven form of 
$\pi = (P, S, \to, p_0)$ in case $P_d \subseteq P \times \mathcal{P}(S)$ and $\pi_d$ satisfies the following conditions. 

1. $((p,\bar{s}),s_d) \to_d ((p',\bar{s}'),s_d')$ implies $s = \bar{s} \Delta s_d$ and $s' = \bar{s}' \Delta s_d'$ for some $s, s'$, 
   and $(p,s) \to (p',s')$. (soundness) 

2. $(p,\bar{s}) \in P_d$, $s \in \bar{s}$, and $(p,s) \to (p',s')$ imply there are $s_d, s_d', \bar{s}'$ such that 
   $s = \bar{s} \Delta s_d$, $s' = \bar{s}' \Delta s_d'$, and $((p,\bar{s}),s_d) \to_d ((p',\bar{s}'),s_d')$. (completeness) 

3. $((p,\bar{s}),s_d) \to_d ((p',\bar{s}'),s_d')$ implies $\bar{s}' \Delta s_d' \in \bar{s}'$. (invariance of $s \in \bar{s}$) 
Condition 3 is actually redundant, as it follows from 1 and the requirement on 
$\Delta$. 

Lemma 2. If $\pi_d$ is an $\bar{s}_0$-driven form of $\pi$, then for any computation 

$$(p_0,s_0) \to (p_1,s_1) \to (p_2,s_2) \to \cdots$$

with $s_0 = \bar{s}_0 \Delta s_{d0}$ there is a computation 

$$((p_0,\bar{s}_0),s_{d0}) \to_d ((p_1,\bar{s}_1),s_{d1}) \to_d ((p_2,\bar{s}_2),s_{d2}) \to_d \cdots$$

with $s_i = \bar{s}_i \Delta s_{di}$ for all $i$. Further, for any such $\pi_d$ computation with 
$s_0 = \bar{s}_0 \Delta s_{d0}$, there is a corresponding $\pi$ computation with $s_i = \bar{s}_i \Delta s_{di}$ for all $i$. 

The first part follows from initialization and completeness, and the second by 
soundness and invariance. The corollary on equivalent input/output behaviour 
requires a modification. 

Corollary 2. If every $s_0 \in \bar{s}_0$ equals $\bar{s}_0 \Delta s_{d0}$ for some $s_{d0}$, then 

$$IO(\pi, \bar{s}_0) = \{ (\bar{s}_0 \Delta s_{d0},\; \bar{s} \Delta s_d) \mid \bar{s}_0 \Delta s_{d0} \in \bar{s}_0 \text{ and } (((p_0,\bar{s}_0),s_{d0}), ((p,\bar{s}),s_d)) \in IO(\pi_d, s_{d0}) \}$$


3.2 Correctness of Code in Driven Flow Charts 

We now redefine driven flow charts to allow different code in $F_d$ than in $F$. 
Commands labeling edges of $F_d$ will be given subscript $d$. Their semantic function 
is: 

$$\mathcal{C}_d[\![\cdot]\!] : Command_d \to (S_d \to S_d)$$

The following rather technical definition can be intuitively understood as saying 
that for each paired $p \stackrel{C}{\Rightarrow} p'$ and $(p,\bar{s}) \stackrel{C_d}{\Rightarrow} (p',\bar{s}')$, the diagram corresponding to 
equation 

$$\mathcal{C}[\![C]\!](\bar{s} \Delta s_d) = \bar{s}' \Delta (\mathcal{C}_d[\![C_d]\!] s_d)$$

commutes, provided that various of its subexpressions are defined. 



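$$\begin{array}{ccc}
S_d & \xrightarrow{\;\bar{s}\,\Delta\,\_\;} & S \\
{\scriptstyle \mathcal{C}_d[\![C_d]\!]}\downarrow & & \downarrow{\scriptstyle \mathcal{C}[\![C]\!]} \\
S_d & \xrightarrow{\;\bar{s}'\,\Delta\,\_\;} & S
\end{array}$$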



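Definition 10. Given flow chart $F = (P,E,p_0)$ and $\bar{s}_0 \subseteq S$, $F_d = (P_d, E_d, (p_0, \bar{s}_0))$ 
is an $\bar{s}_0$-driven form of $F$ if $P_d \subseteq P \times \mathcal{P}(S)$ and $F_d$ satisfies the following 
conditions. 

1. For each $(p,\bar{s}) \stackrel{C_d}{\Rightarrow} (p',\bar{s}') \in E_d$, there exists $p \stackrel{C}{\Rightarrow} p' \in E$ such that $s = \bar{s} \Delta s_d$ 
   and $s' = \mathcal{C}[\![C]\!]s$ are defined if and only if $s_d' = \mathcal{C}_d[\![C_d]\!]s_d$ and $s' = \bar{s}' \Delta s_d'$ are 
   defined. (soundness) 

2. If $p \stackrel{C}{\Rightarrow} p'$, $(p,\bar{s}) \in P_d$, and both $s = \bar{s} \Delta s_d$ and $s' = \mathcal{C}[\![C]\!]s$ are defined, then 
   $E_d$ has an edge $(p,\bar{s}) \stackrel{C_d}{\Rightarrow} (p',\bar{s}')$ such that $s' = \bar{s}' \Delta (\mathcal{C}_d[\![C_d]\!]s_d)$. (completeness) 

3. $(p,\bar{s}) \stackrel{C_d}{\Rightarrow} (p',\bar{s}')$, $p \stackrel{C}{\Rightarrow} p'$, and both $s = \bar{s} \Delta s_d$ and $s' = \mathcal{C}[\![C]\!]s$ are defined 
   imply $\mathcal{C}_d[\![C_d]\!]s_d \in \bar{s}'$. (invariance of $s \in \bar{s}$) 

Theorem 2. If $F_d$ is an $\bar{s}_0$-driven form of $F$, then $\pi^{F_d}$ is an $\bar{s}_0$-driven form of 
$\pi^F$. 

Proof. This is easily verified from Definitions 5 and 10, as the latter is entirely 
parallel to Definition 9. 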



3.3 Partial Evaluation by Projections 

Suppose there is a way to decompose or factor a store $s$ into static and dynamic 
parts without loss of information (a basic idea in [18,19]). A data division is a 
triple of functions $(stat : S \to S_s,\; dyn : S \to S_d,\; pair : S_s \times S_d \to S)$. The ability 
to decompose and recompose without information loss can be expressed by three 
equations: 

$$pair(stat(s), dyn(s)) = s$$
$$stat(pair(v_s, v_d)) = v_s$$
$$dyn(pair(v_s, v_d)) = v_d$$



An Example. For example, a division could be given (as in [18,19]) by an S-D 
vector, for instance SDD specifies the division of $S = \mathbb{N}^3$ into $\mathbb{N} \times \mathbb{N}^2$ where 
$pair(n, (x, a)) = (n, x, a)$, $stat(n, x, a) = n$, and $dyn(n, x, a) = (x, a)$. Using this, 
the program 

   f(n, x) = g(n, x, 1) 
   g(n, x, a) = if n = 0 then a else g(n − 1, x, x * a) 

can be specialized with respect to known n = 2 to yield: 

   f_2(x) = g_2(x, 1) 
   g_2(x, a) = g_1(x, x * a) 
   g_1(x, a) = g_0(x, x * a) 
   g_0(x, a) = a 

which by transition compression can be further reduced to 

   f_2(x) = x * x 
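
A minimal sketch of this specialization in Python, assuming the residual program is represented simply as a list of equation strings. The name `specialize_f` and its output format are hypothetical choices made for illustration; this is not the specializer of [18,19].

```python
def g(n, x, a):
    # Source program: multiply the accumulator by x until the counter n is 0.
    return a if n == 0 else g(n - 1, x, x * a)


def f(n, x):
    return g(n, x, 1)


def specialize_f(n):
    """Specialize f to a known n >= 0.

    Returns the residual equations (one g_k per static value k of n) and the
    expression obtained by compressing all transient transitions."""
    equations = [f"f{n}(x) = g{n}(x, 1)"]
    expr = "1"
    for k in range(n, 0, -1):
        equations.append(f"g{k}(x, a) = g{k - 1}(x, x * a)")
        # Unfold one call; drop the redundant factor 1 on the first step.
        expr = "x" if expr == "1" else f"x * {expr}"
    equations.append("g0(x, a) = a")
    return equations, expr


equations, expr = specialize_f(2)
```

The static argument n has disappeared from the residual equations: it survives only in the names of the specialized functions, which is the role played by the store set in a driven program point.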
Relationship between Driving and Projections. This method can be interpreted 
in current terms as specialization by using store sets that are equivalence 
classes with respect to static projections, i.e. every store set is of the following 
form for some $v_s \in S_s$: 

$$\bar{s}_{v_s} = \{ s \mid stat(s) = v_s \}$$

Store reconstruction can be expressed by defining: $\bar{s}_{v_s} \Delta v_d = pair(v_s, v_d)$. A 
specialized program in [18,19] only contains transitions of form 

$$((p, stat(s)), dyn(s)) \to ((p', stat(s')), dyn(s'))$$

where $\pi$ contains $(p, s) \to (p', s')$. This corresponds to our soundness condition. 
The set “poly” in [18,19] is constructed so that if $(p_0,s_0) \to^* (p,s)$ by $\pi$ for some 
$s_0 \in \bar{s}_0$, then $(p, stat(s)) \in poly$ and so $\pi_d$ contains a specialized program point $(p, stat(s))$, 
ensuring completeness. Invariance of $s \in \bar{s}$ is immediate since every specialized 
state is of the form $((p, \bar{s}_{v_s}), v_d)$, and 

$$\bar{s}_{v_s} \Delta v_d = pair(v_s, v_d) \in \{ s \mid stat(s) = v_s \}$$

since $stat(pair(v_s, v_d)) = v_s$. The following definition is central in [18,19]: 

Definition 11. Function $stat : S \to S_s$ is congruent if for any $\pi$ transitions 
$(p, s) \to (p', s')$ and $(p, s_1) \to (p', s_1')$, if $stat(s) = stat(s_1)$, then $stat(s') = 
stat(s_1')$. 

This is essentially the “no new tests” requirement of Definition 7. 

4 An Algorithm for Driving 

The driving algorithm of Figure 3 manipulates store descriptions $\sigma \in \Sigma$, rather 
than store sets. For the $x^n$ example above, $\Sigma$ is the set of all store descriptions 
$\sigma$ of the form 

$$\sigma = [n \mapsto u, x \mapsto \top, a \mapsto \top]$$

where $u \in \mathbb{N}$. We assume given a concretization function $\gamma : \Sigma \to \mathcal{P}(S)$ defining 
their meanings, and that the test “is $\gamma\sigma = \{\}$?” is computable, i.e. that we can 
recognize a description of the empty set of stores. 

In addition we assume given a store set update function 

$$\mathcal{S} : Command \times \Sigma \to \Sigma$$

and a code generation function 

$$\mathcal{G} : Command \times \Sigma \to Command_d$$
read $F = (P, E, p_0)$; 
read $\sigma_0$; 
Pending := $\{(p_0, \sigma_0)\}$;                (* Unprocessed program points *) 
SeenBefore := $\{\}$;                       (* Already processed pgm. points *) 
$P_d$ := $\{(p_0, \sigma_0)\}$;                    (* Initial program points *) 
$E_d$ := $\{\}$;                              (* Initial edge set *) 
while $\exists (p, \sigma) \in$ Pending do            (* Choose an unprocessed point *) 
   Pending := Pending $\setminus \{(p, \sigma)\}$; 
   SeenBefore := SeenBefore $\cup \{(p, \sigma)\}$; 
   forall $p \stackrel{C}{\Rightarrow} p' \in E$ do                 (* Scan all transitions from p *) 
      $\sigma'$ := $\mathcal{S}(C, \sigma)$;                   (* Update store set description *) 
      if $\gamma\sigma' \neq \{\}$ then                 (* Generate code if nontrivial *) 
         $P_d$ := $P_d \cup \{(p', \sigma')\}$; 
         if $(p', \sigma') \notin$ SeenBefore then add $(p', \sigma')$ to Pending; 
         $C_d$ := $\mathcal{G}(C, \sigma)$;              (* Generate code *) 
         Add edge $(p, \sigma) \stackrel{C_d}{\Rightarrow} (p', \sigma')$ to $E_d$;   (* Extend flow chart by one edge *) 
$F_d$ := $(P_d, E_d, (p_0, \sigma_0))$; 





Fig. 3. An algorithm for driving 
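
The worklist structure of Figure 3 translates almost line by line into Python. The sketch below is an illustration under stated assumptions: `drive`, `update` and `EDGES` are names invented here, commands are plain strings, code generation is the identity, and the parity domain and edge list encode the Collatz flow chart of Section 2.4 as inferred from the text (the figures are not reproduced). In particular, mapping the test n = 1 on the full store set to Odd is a choice made here.

```python
EVEN, ODD, TOP, BOT = "Even", "Odd", "T", "Bot"

# Assumed edges (p, command, p') of the Collatz flow chart.
EDGES = [
    ("A", "n != 1", "B"), ("A", "n == 1", "G"),
    ("B", "even n", "C"), ("B", "odd n", "D"),
    ("C", "n := n / 2", "A"), ("D", "n := 3n + 1", "A"),
]


def update(cmd, sigma):
    """Store set update function for the four parity descriptions."""
    if sigma == BOT:
        return BOT
    if cmd == "n != 1":
        return sigma
    if cmd == "n == 1":
        return BOT if sigma == EVEN else ODD
    if cmd == "even n":
        return BOT if sigma == ODD else EVEN
    if cmd == "odd n":
        return BOT if sigma == EVEN else ODD
    if cmd == "n := n / 2":
        return TOP
    if cmd == "n := 3n + 1":
        return {EVEN: ODD, ODD: EVEN, TOP: TOP}[sigma]
    raise ValueError(cmd)


def drive(edges, p0, sigma0, update, gen, is_empty):
    """Build driven program points and edges from (p0, sigma0)."""
    pending, seen = [(p0, sigma0)], set()
    points, edges_d = {(p0, sigma0)}, []
    while pending:
        p, sigma = pending.pop()            # choose an unprocessed point
        if (p, sigma) in seen:              # the list may hold duplicates
            continue
        seen.add((p, sigma))
        for q, cmd, p2 in edges:            # scan all transitions from p
            if q != p:
                continue
            sigma2 = update(cmd, sigma)     # update store set description
            if not is_empty(sigma2):        # generate code if nontrivial
                points.add((p2, sigma2))
                if (p2, sigma2) not in seen:
                    pending.append((p2, sigma2))
                edges_d.append(((p, sigma), gen(cmd, sigma), (p2, sigma2)))
    return points, edges_d


points, edges_d = drive(EDGES, "A", TOP, update,
                        gen=lambda cmd, sigma: cmd,
                        is_empty=lambda sigma: sigma == BOT)
```

On this input the sketch produces seven reachable driven points, and the points (A, Even) and (B, Even) each have a single outgoing edge, which is what makes them candidates for transition compression. Termination here relies on the description set being finite; Section 4.2 discusses the general case.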



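Correctness Criterion. For any $C \in Command$, $\sigma \in \Sigma$, $s_d \in S_d$, let $\sigma' = 
\mathcal{S}(C, \sigma)$ and $C_d = \mathcal{G}(C, \sigma)$. Definition 10 requires 

$$\mathcal{C}[\![C]\!]((\gamma\sigma) \Delta s_d) = (\gamma\sigma') \Delta (\mathcal{C}_d[\![C_d]\!] s_d)$$

under certain conditions (where $t = t'$ means both are defined and the values 
are equal): 

1. $s = (\gamma\sigma) \Delta s_d$ and $s_d' = \mathcal{C}_d[\![C_d]\!]s_d$ imply $\mathcal{C}[\![C]\!]s = (\gamma\sigma') \Delta s_d'$. (soundness) 

2. $s' = \mathcal{C}[\![C]\!]s$ and $s = (\gamma\sigma) \Delta s_d$ imply $s' = (\gamma\sigma') \Delta (\mathcal{C}_d[\![C_d]\!]s_d)$. (completeness) 

3. $s = (\gamma\sigma) \Delta s_d \in \gamma\sigma$ implies $\mathcal{C}_d[\![C_d]\!]s_d \in \gamma\sigma'$. (invariance of $s \in \bar{s}$) 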



4.1 Example: Pattern Matching in Strings 



A way to test a program transformation method’s power is to see whether it can 
derive certain well-known efficient programs from equivalent naive and inefficient 
programs. One of the most popular of such tests is to generate, from a naive 
pattern matcher and a fixed pattern, an efficient pattern matcher as produced 
by the Knuth-Morris-Pratt algorithm. We shall call this the KMP test [27]. 

First we give a program for string pattern matching. 
   match p s = loop p s p s 

   loop [] ss op os = True 
   loop (p : pp) [] op os = False 
   loop (p : pp) (s : ss) op os = if p = s then loop pp ss op os else next op os 

   next op [] = False 
   next op (s : ss) = loop op ss op ss 

For conciseness in exposition, we specify the store sets that are encountered while 
driving match AAB u by means of terms containing free variables. These are 
assumed to range over all possible data values. Given this, the result of driving 
can be described by the configuration graph seen in the Figure ending this paper 
(where some intermediate configurations have been left out). More details can 
be seen in [27]. 

The program generated is: 

   f u            = f_AAB u 
   f_AAB []       = False 
   f_AAB (s : ss) = g s ss 
   g s ss         = if A = s then f_AB ss else f_AAB ss 
   f_AB []        = False 
   f_AB (s : ss)  = h s ss 
   h s ss         = if A = s then f_B ss else g s ss 
   f_B []         = False 
   f_B (s : ss)   = if A = s then g s ss else 
                    if B = s then True else h s ss 



This is in essence a KMP pattern matcher, so driving passes the KMP test. It 
is interesting to note that driving has transformed a program running in time 
$O(m \cdot n)$ into one running in time $O(n)$, where $m$ is the length of the pattern 
and $n$ is the length of the subject string. 
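
The equivalence of the naive matcher and a residual matcher for the pattern AAB can be checked by brute force on short strings. The Python sketch below is illustrative and rests on assumptions: strings stand in for lists, the residual functions pass an index instead of a tail, and `agree_up_to` is a helper invented here. For f_B the sketch follows the configuration graph (a single test B = s, falling back to h) rather than the printed listing, because branching to g when s = A would forget one of the two A's already matched.

```python
from itertools import product


def match(p, s):
    """Naive matcher, a direct transcription of match / loop / next."""
    pp, ss, op, os_ = p, s, p, s
    while True:
        if not pp:                 # loop [] ss op os = True
            return True
        if not ss:                 # loop (p:pp) [] op os = False
            return False
        if pp[0] == ss[0]:         # heads agree: advance in both strings
            pp, ss = pp[1:], ss[1:]
        else:                      # next op os: restart one position later
            os_ = os_[1:]          # os_ is non-empty here, since ss is its suffix
            pp, ss = op, os_


# Residual matcher for the fixed pattern AAB; i indexes the unread input.
def f_AAB(u, i=0):
    return False if i == len(u) else g(u[i], u, i + 1)


def g(s, u, i):
    return f_AB(u, i) if s == "A" else f_AAB(u, i)


def f_AB(u, i):
    return False if i == len(u) else h(u[i], u, i + 1)


def h(s, u, i):
    return f_B(u, i) if s == "A" else g(s, u, i)


def f_B(u, i):
    if i == len(u):
        return False
    s = u[i]
    return True if s == "B" else h(s, u, i + 1)


def agree_up_to(k, alphabet="ABC"):
    """Compare both matchers on every string over alphabet of length <= k."""
    return all(f_AAB("".join(t)) == match("AAB", "".join(t))
               for n in range(k + 1)
               for t in product(alphabet, repeat=n))
```

Each residual function reads an input character at most once and never backs up in the subject string, which is where the linear running time comes from; the repeated comparison of the same character s against A in h and g is the kind of redundant test discussed next.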

Using configurations as above can result in some redundant tests, because we 
only propagate positive information (what term describes the negative outcome 
of a test?). However this problem can easily be overcome by using both positive 
and negative environments, see [15]. 

Partial evaluators of which we know (other than the supercompiler) cannot 
achieve this effect without nontrivial human rewriting of the matching program. 
[Figure: configuration graph obtained by driving match AAB u] 

match AAB u 

(f_AAB)  loop AAB u AAB u 
         u = []:        False 
         u = (s : ss):  continue at (g) 

(g)      A = s:         loop AB ss AAB (s : ss), continue at (f_AB) 
         otherwise:     next AAB (s : ss), back to (f_AAB) 

(f_AB)   loop AB ss AAB (A : ss) 
         ss = []:       False 
         ss = (s : ss): continue at (h) 

(h)      A = s:         loop B ss AAB (A : s : ss), continue at (f_B) 
         otherwise:     next AAB (A : s : ss), 
                        then loop AAB (s : ss) AAB (s : ss), back to (g) 

(f_B)    loop B ss AAB (A : A : ss) 
         ss = []:       False 
         B = s:         loop [] ss AAB (A : A : B : ss), giving True 
         otherwise:     next AAB (A : A : s : ss), 
                        then loop AAB (A : s : ss) AAB (A : s : ss); 
                        A = A holds, giving loop AB (s : ss) AAB (A : s : ss), back to (h) 



4.2 Finiteness and Generalization 

$\Sigma$ is usually an infinite set, causing the risk of generating infinitely many different 
configurations while driving. Turchin uses the term generalization for the 
problem of choosing configurations to yield both finiteness and efficiency [33,35]. 

The idea is to choose elements $\sigma' = \mathcal{S}(C, \sigma)$ which are “large enough” to 
ensure finiteness of the transformed program, but are still small enough to allow 
significant optimizations. This may require one to ignore some information that 
is available at transformation time, i.e. to choose descriptions of larger and so 
less precise store sets than would be possible on the basis of the current $\sigma$ and $C$. 

How to achieve termination without overgeneralization is not yet fully understood. 
Turchin advocates an online technique, using the computational history 
of the driving process to guide the choices of new $\sigma'$ [35]. It is as yet unclear 
whether self-application for practical compiler generation can be achieved in this 
way, or whether some form of preprocessing will be needed. If offline preprocessing 
is needed, it will certainly be rather different from “binding-time analysis” 
as used in partial evaluation [19]. 
Abstract. Existing partial evaluators usually fix the strategy for binding-time 
analysis. But a single strategy cannot fulfill all goals without 
leading to compromises regarding precision, termination, and code explosion 
in partial evaluators. Our goal is to improve the usability of 
partial evaluator systems by developing an adaptive approach that can 
accommodate a variety of different strategies ranging from maximally 
polyvariant to entirely uniform analysis, and thereby make offline specialization 
more practical in a realistic setting. The core of the analysis 
has been implemented in FSpec, an offline partial evaluator for a subset 
of Fortran 77. 



1 Introduction 

Partial evaluation of imperative programs was pioneered by Ershov and his 
group [13,7]; later Jones et al. [21] introduced binding-time analysis (BTA) to 
achieve self-application of a partial evaluator. This offline approach to partial 
evaluation has been studied intensively since then. 

However, not much attention has been paid to the properties of the binding-time 
analysis in offline partial evaluation (notable exceptions are [11,23,8,6]). 
This is surprising because the annotations a BTA produces guide the specialization 
process of an offline partial evaluator and, thus, control the quality of the 
program transformation. The choice of the annotation strategy is therefore the 
most decisive factor in the design of an offline partial evaluator. 

Existing offline partial evaluators fix a particular binding-time strategy (e.g., 
[3,9,12,22]). None of them allow the partial evaluator to function with different 
levels of precision, and all systems implement different strategies based on 
decisions taken on pragmatic grounds. The growing importance of non-trivial 
applications with varying specialization goals (e.g. interpreter specialization vs. 
software maintenance) motivated us to examine a more flexible approach to 
binding-time analysis for imperative languages. Our goal is to improve the usability 
of partial evaluation systems by developing an analysis framework that 
allows an easy adaptation and control of different binding-time strategies within 
the same specialization system. 
Program 1: Monitor   (Upd = FALSE, Val = 100, OutVal = 0; CurVal is dynamic) 

Source code (1)          Res. code, unif. BTA (1A)   Res. code, poly. BTA (1B) 
10: IF Upd=TRUE THEN     10: Val:=100;               10: OUTPUT 5; 
11:   Val:=CurVal;       11: OutVal:=f(Val); 
12: ENDIF;               12: OUTPUT OutVal; 
13: OutVal:=f(Val); 
14: OUTPUT OutVal; 

Program 2: Affine   (a = 2, b = 5; x is dynamic, count is dynamic) 

Source code (2)          Res. code, unif. BTA (2A)   Res. code, poly. BTA (2B) 
10: IF a>0 THEN          10: b:=5;                   10: p1(x); 
11:   p(x);              11: p(x);                   11: p2(x); 
12:   GOTO 10;           12: p(x); 
13: ENDIF; 
100: PROCEDURE p(y):     100: PROCEDURE p(y):        100: PROCEDURE p1(y): 
101:   a:=a-1;           101:   b:=b+y;              101:   b:=5+y; 
102:   b:=b+y;           102:   count:=count+1;      102:   count:=count+1; 
103:   count:=count+1;   103:   RETURN;              103:   RETURN; 
104:   RETURN;                                       104: PROCEDURE p2(y): 
                                                     105:   b:=b+y; 
                                                     106:   count:=count+1; 
                                                     107:   RETURN; 

Fig. 1. Problem source: One BTA is not best for all source programs 



In this paper we examine the design space of binding-time strategies and 
develop a framework to formalize different strategies that allows a partial eval- 
uator to function with different levels of granularity. We claim that it is expressive 
enough to cover all existing strategies and allows the design and comparison of 
new strategies. The core of the analysis engine is implemented for FSpec, an 
offline partial evaluator for a subset of Fortran 77 [22]. We assume familiarity 
with the basic notions of offline partial evaluation, e.g. [19, Part II]. 



2 Problem Source: One Size Does Not Fit All 

In existing partial evaluators, the strategy of the binding-time analysis (BTA), 
and thus its precision, is fixed at design-time; in essence assuming ‘One Size Fits 
All’. The most popular strategy for BTA, due to its conceptual simplicity, is 
to annotate programs using uniform divisions [19]. In this case one division is 
valid for all program points. A polyvariant BTA allows each program point to 
be annotated with one or more divisions. 

Figure 1 shows two pieces of source programs and for each the result of two 
different specializations: one directed by a uniform BTA (column A) and one 
directed by a polyvariant BTA (column B). We assume polyvariant program point 
specialization [7,19] (a program point in the source program may be specialized 
wrt. different static stores). Program Monitor updates variable Val depending 
on the value of flag Upd (we assume that function f has no side effects and that 
f(100) = 5). Program Affine repeatedly calls procedure p. Variables a, b and 
count are global. 

For Monitor, the polyvariant BTA (1B) clearly achieves the best specialization 
because result 5 is computed at specialization time. The uniform BTA 
(1A) must consider Val dynamic and can therefore not allow the call of f to 
be computed at specialization time. For Affine, the uniform BTA (2A) seems to 
provide a better specialization. The polyvariant BTA (2B) recognizes that the 
value of b is sometimes static (in the first round of the loop) and creates an extra 
instance of procedure p. This leads to undesirable duplication of code (which is 
more dramatic for larger programs). Almost all existing partial evaluators, such 
as C-Mix [3] and FSpec [22], give (1A,2A); Tempo [12] gives (1A,2B). 

To conclude, the uniform BTA is preferable for Affine and the polyvariant 
BTA is preferable for Monitor. A partial evaluator that is confined to one of the 
two strategies, A or B, may not be suitable for the task at hand. In such a case the 
user has to resort to rewriting the source program to influence the specialization. 
This is why we are looking for a more elegant and flexible solution to BTA. 

3 Binding-Time Analysis and Maximal Polyvariance 

First, we give a quite abstract definition of a programming language as a state 
transition system. Then, we give a formalization of binding-time analyses and 
define maximal polyvariance. 

3.1 Preliminary Definitions 

We consider only first-order deterministic programming languages, and assume 
that any program has a set of program points. Examples include labels in a 
flow chart language and function names in a functional language. Their essential 
characteristic is that computation proceeds sequentially from program point 
to program point by execution of a series of commands, each of which updates 
a program state. These states are usually described by a pair consisting of a 
program point and a store. The meaning of each command is then a state transformation 
computing the effect of the command on a state. We assume a small-step 
semantics (i.e., the execution of each command terminates). 

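Definition 1. A programming language is a tuple $L = (\mathcal{P}, \mathcal{C}, \mathcal{S}, [\![\cdot]\!])$, where 
$[\![\cdot]\!] : \mathcal{C} \to \mathcal{S} \to \mathcal{P} \times \mathcal{S}$ is a partial function. Terminology: $\mathcal{P}$ is the set of program 
points, $\mathcal{C}$ is the set of commands, $\mathcal{S}$ is the set of stores, and $[\![\cdot]\!]$ is the semantics 
of $L$. A state is a pair $(p, \sigma) \in \mathcal{P} \times \mathcal{S}$. 

Definition 2. Let $L$ be a programming language, then an $L$-program is a partial 
mapping $P : \mathcal{P} \to \mathcal{C}$, where $\mathcal{P}$ is the set of program points of $L$ and $\mathcal{C}$ is the 
set of commands of $L$. We assume each $L$-program $P$ has the property that 
$\forall \sigma \in \mathcal{S}.\, \forall p \in dom(P) : [\![P(p)]\!]\sigma = (p', \sigma')$ implies $p' \in dom(P)$, if defined. 
Notation: The initial program point of a program $P$ is denoted by $p_0$. 

Definition 3. Let $P$ be an $L$-program, define computation step as transition 
relation $\to\; \subseteq (\mathcal{P}, \mathcal{S}) \times (\mathcal{P}, \mathcal{S})$ such that $(p, \sigma) \to (p', \sigma')$ iff $[\![P(p)]\!]\sigma = (p', \sigma')$ is 
defined. A computation (from $\sigma_0 \in \mathcal{S}$) is a finite or infinite sequence 
$(p_0, \sigma_0) \to (p_1, \sigma_1) \to \cdots$ 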
From now on we look at programming languages where the store is modelled by a 
finite function $\sigma = [x_1 \mapsto v_1, \ldots, x_n \mapsto v_n]$ which maps variables $x \in \mathcal{X}$ to values 
$v \in \mathcal{V}$. We assume there are only finitely many variables in any given program. 
Notation $\sigma(x_i)$ denotes value $v_i$ in $\sigma$. More complicated store models exist and 
can be handled in our framework (e.g., including locations for modelling pointers 
and aliasing), but are omitted for simplicity. 

3.2 Abstract Formulation of Binding-Time Analysis 

The main feature of offline partial evaluation [19] is that program specialization 
proceeds in two steps: a binding-time analysis (BTA) followed by a specializa- 
tion phase. First, the source program is analyzed over a domain consisting of 
two abstract values, S and D, where S (static) represents a value known at 
specialization time, D (dynamic) represents a value that may be unknown at 
specialization time (such a classification of the variables is often called a divi- 
sion). Second, the source program is specialized wrt. known values following the 
static/dynamic annotations made by the BTA. 

The BTA associates with each program point one or more binding-time stores 
where each binding-time store maps variables to binding-time values. We limit 
ourselves to a finite description of binding-time values. 

Definition 4. A binding-time value is a value $b \in B$ where $B = \{S, D\}$. A
binding-time store $\beta : X \to B$ maps variables to binding-time values. A
binding-time semantics $[\![\cdot]\!]_{bta} : \mathcal{C} \to (X \to B) \to (X \to B)$
maps a command and a binding-time store to a binding-time store. A binding-time
state is a pair $(p,\beta)$, where p is a program point and $\beta$ is a
binding-time store. Notation $\sigma|_{\beta:S}$ denotes a store restricted to
variables mapped to S in $\beta$.

Defining binding-time stores as a map from variables to binding-time values 
does not exclude data structures, such as arrays or records, where the size of 
the structure is fixed at compile time. For example, the fields of a record can
be treated as individual variables. Often a single variable is used to represent
the binding-time value of the whole array. 

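Definition 5. Let P be an L-program, define binding-time step as transition
relation $\to\ \subseteq (\mathcal{P} \times (X \to B)) \times (\mathcal{P} \times (X \to B))$
such that $(p,\beta) \to (p',\beta')$ iff

$[\![P(p)]\!]_{bta}\,\beta = \beta' \ \wedge\ \exists \sigma,\sigma'.\ (p,\sigma) \to (p',\sigma')$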

We expect $[\![\cdot]\!]_{bta}$ to be a realization of the congruence rules of
language L [19]. Given $[\![P(p)]\!]_{bta}\,\beta = \beta'$, we expect that for any
transition $(p,\sigma) \to (p',\sigma')$, the values of the variables classified
as S in $\beta'$ must be computable from the values of the variables classified
as S in $\beta$. This congruence requirement is captured more formally by the
following definition.

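Definition 6. A binding-time semantics $[\![\cdot]\!]_{bta}$ is congruent iff for
every program P, any variable $x \in X$, any transition
$(p,\beta) \to (p',\beta')$, and any two stores $\sigma, \sigma'$ such that
$[\![P(p)]\!]\sigma = (p',\sigma_1)$ and $[\![P(p)]\!]\sigma' = (p',\sigma_2)$ we have

$\sigma|_{\beta:S} = \sigma'|_{\beta:S} \ \wedge\ \beta'(x) = S \ \Rightarrow\ \sigma_1(x) = \sigma_2(x)$
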
3.3 Maximally Polyvariant Binding-Time States

The task of a BTA is, given an L-program P and a bt-state $(p_0,\beta_0)$ of P, to
compute a set of bt-states (denoted by Ann). This set is always finite because
a program has finitely many variables and there are finitely many bt-values. To
keep our discussion language-independent, we shall clearly separate the set of
bt-states from the syntactic annotation of a source program.

We wish to specify a soundness condition for an annotation, intuitively stat- 
ing that a specializer should be able to partially evaluate the source program 
using the annotation. The definition must thus be relative to the specializer. We 
let the properties of this specialiser be reflected by a corresponding binding-time 
semantics (which thus represents the needs of the specializer). 

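Definition 7. A set of bt-states is called an annotation. An annotation, Ann,
is sound iff the initial bt-state $(p_0,\beta_0) \in Ann$ and for all
$(p_j,\beta_j) \in Ann$ we have

- There is a $(p_i,\beta_i) \in Ann$ and a bt-store $\beta'_j$ such that
  $(p_i,\beta_i) \to (p_j,\beta'_j)$ and
  ${\beta'_j}^{-1}(D) \subseteq \beta_j^{-1}(D)$.

- For all $p_k \in \{p \in \mathcal{P} \mid \exists \sigma,\sigma' : (p_j,\sigma) \to (p,\sigma')\}$
  there is a $(p_k,\beta_k) \in Ann$ and a binding-time store $\beta'_k$ such that
  $(p_j,\beta_j) \to (p_k,\beta'_k)$ and
  ${\beta'_k}^{-1}(D) \subseteq \beta_k^{-1}(D)$.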

Definition 8. Let P be an L-program and let $\beta_0$ be an initial bt-store, then
$polymax(P,\beta_0)$ denotes the set of bt-states defined by

$polymax(P,\beta_0) = \{(p,\beta) \mid (p_0,\beta_0) \to^{*} (p,\beta)\}$

Clearly, this set is sound. We call it the maximally polyvariant annotation. 
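
Definition 8 describes a reachability closure, which can be computed with a standard worklist. The sketch below rests on several assumptions that are not part of the text: the four-variable toy program, the bt-rule for assignment (the target is static iff every variable read is static), and the use of a fixed successor set per program point as a stand-in for the condition that some concrete transition exists.

```python
from collections import deque

VARS = ("c", "d", "y", "z")          # hypothetical program variables


def bt_assign(target, sources):
    """Assumed bt-semantics of `target := e`, where e reads `sources`."""
    def step(beta):
        env = dict(zip(VARS, beta))
        env[target] = "S" if all(env[v] == "S" for v in sources) else "D"
        return tuple(env[v] for v in VARS)
    return step


def bt_skip(beta):
    """bt-semantics of a command that does not change any variable."""
    return beta


# program point -> (bt-semantics of its command, possible successor points)
PROGRAM = {
    0: (bt_skip, (1, 2)),                  # IF c THEN goto 1 ELSE goto 2
    1: (bt_assign("y", ("d",)), (2,)),     # y := d
    2: (bt_assign("z", ("y",)), (3,)),     # z := y
    3: (bt_skip, ()),                      # end of program
}


def polymax(program, p0, beta0):
    """All bt-states reachable from (p0, beta0) by bt-steps."""
    seen = {(p0, beta0)}
    work = deque([(p0, beta0)])
    while work:
        p, beta = work.popleft()
        step, successors = program[p]
        beta2 = step(beta)
        for q in successors:
            if (q, beta2) not in seen:
                seen.add((q, beta2))
                work.append((q, beta2))
    return seen
```

With the initial division `("S", "D", "S", "S")`, program point 2 is reached with two different bt-stores (y static via the else-branch, y dynamic via the then-branch), which is exactly the kind of polyvariance that the strategies discussed later restrict.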

We have not discussed how to model procedures and calls. This is possible 
but requires non-trivial extensions of the store model (e.g. locations) which we 
shall not describe here. 



4 Dimensions of Binding-Time Analysis 

Programs can be annotated in many ways. A binding-time strategy for realistic 
applications has to accommodate three important, but — unfortunately — often 
conflicting transformation goals: 

1. Increasing staticness by more precise analysis. 

2. Taming code explosion by reducing the amount of polyvariance at spe- 
cialization time. 

3. Ensuring termination of the specialization process by dynamizing opera- 
tions that lead to infinite transformations. 
[Figure 2: diagram not reproduced; its three panels are labelled n : 1, 1 : 1, and 1 : m.]

Fig. 2. Granularity of binding-time analysis



A uniform BTA computes one division that is valid for all program points 
(illustrated in Fig. 2). For small programs this assumption is reasonable, but 
not for larger applications because of the non-locality of binding-time effects, 
a problem with flow-insensitive analyses known from the design of optimizing 
compilers. For example, a uniform BTA carries the dynamization of a variable 
in one region to all other regions of a program, even though the variable may 
serve locally distinct purposes in each region. 

Pointwise and polyvariant analyses are flow-sensitive. They allow each pro- 
gram point to be annotated with one or more local divisions (Fig. 2). This 
can significantly improve staticness in programs and avoid the need for man- 
ual binding-time improvements. For example, the BTA of Tempo [17] computes 
pointwise divisions for basic blocks and polyvariant divisions on the procedure 
level. 

Increased staticness in a program does not always come for free. Non-termina- 
tion of the specialization process and code explosion of the generated programs 
are some of the risks one faces. In particular, static values that vary without
bound lead to infinite specialization (for each static store encountered at a
program point, a specialized version is produced by the specialization phase).
Termination can be ensured by dynamizing such static variables. Strategies for 
ensuring termination without being overly conservative are a topic of current 
research [5,14]. 

5 Strategy Language 

Informally, a BTA strategy is a guiding principle for annotation. Known strate- 
gies include uniform analysis and pointwise analysis. Our aim is to specify a 
high-level ‘strategy language’ which may be used to control a binding-time
analysis. The ambition is that the language be simple while offering a design
space large enough to compare the relative strength of different BTA strategies.

Formally, we define a strategy to be a criterion for being well-formed (wrt.
the strategy). For instance, an annotation is well-formed wrt. the uniform BTA
strategy if and only if every variable has the same annotation in all bt-stores in
the annotation. In this paper, all strategies are of the form

$\forall x \in X.\ \forall (p,\beta), (p',\beta') \in Ann :\ S(x,p,p',\beta,\beta') \Rightarrow \beta(x) = D$

where the predicate S can take many forms. We will identify a strategy with the
predicate S that defines it. We implicitly assume that all annotations are sound.
For convenience, we omit the parameters of a predicate, as in the definitions
in Fig. 3. Consider the definition of $S_{uniform}$. This predicate defines a
strategy that allows only one annotation for each variable in the source program.
The predicate is so simple that it does not need to refer to p or p'.

    $S_{uniform}   = (\beta'(x) = D)$
    $S_{pointwise} = S_{uniform} \wedge p = p'$
    $S_{polymax}   = False$

Fig. 3. Three well-known BTA strategies

         Source code            Annotations
    10:  IF Upd=TRUE THEN       (S,S,D,S)
    11:    Val:=CurVal;         (S,S,D,S)
    12:  ENDIF;                 (S,D,D,S)
    13:  OutVal:=f(Val);        (S,S,D,S) (S,D,D,S)
    14:  OUTPUT OutVal;         (S,S,D,S) (S,D,D,S)

Fig. 4. Polyvariant annotation of Monitor: (Upd, Val, CurVal, OutVal)

To see what this strategy means, consider the Monitor program. A polyvariant
annotation is given in Fig. 4. This annotation is not well-formed wrt.
$S_{uniform}$ since Val has more than one annotation. More formally, choosing

$x = \mathit{Val};\ (p,\beta) = (13, (S,S,D,S));\ (p',\beta') = (13, (S,D,D,S))$

we evidently get a counterexample to $S_{uniform}$. We say that x and $(p,\beta)$
form a violation of the strategy. Of course, if an annotation is not well-formed
wrt. some strategy S, a violation of S must exist.

A natural annotation that does satisfy the uniformity constraint is the set
$\{(p, (S,D,D,S)) \mid p \in \{10, 11, 12, 13, 14\}\}$, which is also the one that
we would expect as output of a uniform BTA. Note, however, that classifying all
variables dynamic at all program points is an annotation that is also (trivially)
well-formed wrt. $S_{uniform}$. This annotation will be well-formed wrt. any strategy.

Another example of a well-known strategy is $S_{pointwise}$, which is also defined
in Fig. 3. It is obtained by applying the uniform strategy to individual program
points, merging bt-stores only if different ones occur at a single point in
the program. This strategy forces a monovariant (but not necessarily uniform)
annotation of all variables. Finally, as we have implicitly required all
annotations to be sound, we get a maximally polyvariant strategy by adding no
further requirements.
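
The three strategies of Fig. 3 translate directly into predicates, and well-formedness becomes a search for violations. The following sketch encodes the annotation of Fig. 4 as printed; the encoding of bt-stores as four-letter strings, the index-based variable argument, and the function names are assumptions made for illustration.

```python
VARS = ("Upd", "Val", "CurVal", "OutVal")

# Annotation of Fig. 4: set of (program point, bt-store) pairs.
FIG4 = {
    (10, "SSDS"), (11, "SSDS"), (12, "SDDS"),
    (13, "SSDS"), (13, "SDDS"), (14, "SSDS"), (14, "SDDS"),
}


def s_uniform(i, p, p2, beta, beta2):
    """x must be dynamic here if it is dynamic in any other bt-state."""
    return beta2[i] == "D"


def s_pointwise(i, p, p2, beta, beta2):
    """As s_uniform, but only bt-states at the same program point count."""
    return s_uniform(i, p, p2, beta, beta2) and p == p2


def s_polymax(i, p, p2, beta, beta2):
    """No additional requirement beyond soundness."""
    return False


def violations(ann, strategy):
    """All (variable, point, bt-store) triples where the strategy demands D
    but the bt-store classifies the variable as static."""
    found = set()
    for p, beta in ann:
        for p2, beta2 in ann:
            for i, x in enumerate(VARS):
                if strategy(i, p, p2, beta, beta2) and beta[i] != "D":
                    found.add((x, p, beta))
    return sorted(found)
```

An annotation is well-formed wrt. a strategy exactly when `violations` returns an empty list; for the annotation of Fig. 4 this is the case for `s_polymax` but not for `s_uniform` or `s_pointwise`, since Val is static in one bt-store and dynamic in another at points 13 and 14.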
BTA Classification. Formulating the BTA strategies in our language allows us to
study their relative strengths formally. Recall the overall classification of BTA
strategies given in Figure 2. Most systems (e.g. C-Mix [3] and FSpec [22]) use
some BTA strategy, S, which is no stronger than the pure, uniform strategy in
the sense that $S \Leftarrow S_{uniform}$, i.e. these systems detect no more
staticness than one does with the pure, uniform strategy. For several systems
biimplication does not hold because some variables are generalized in order to
ensure termination of the specializer.

The Tempo system has a non-uniform BTA strategy, $S_{Tempo}$, that allows a
limited amount of polyvariance¹. Between $S_{uniform}$ and $S_{Tempo}$ neither
left- nor right-implication holds; Tempo is non-uniform, but it forces
generalization of static variables defined under dynamic control.

The strategy $S_{polymax}$ is trivially stronger than all other strategies. Note that
what we compare is precision, which is not a universal measure of quality. We 
argue that for some source programs the user needs a precise analysis, for other 
source programs a less precise analysis is better. 



6 Simple Construction of Well-Formed Annotations 

In this section, we take a small detour to sketch one way of implementing a BTA 
algorithm that is able to realize any strategy as described above. The algorithm 
builds upon a maximally polyvariant BTA as defined in Section 3. The authors 
have implemented a maximally polyvariant PolyMax function [10] for a non- 
trivial subset of Fortran — the subset of the FSpec partial evaluator [22]. 

We assume that the source language has a command $dyn_{p,x}$ for which

$[\![dyn_{p,x}]\!]\,\sigma = (p,\sigma)$

$[\![dyn_{p,x}]\!]_{bta}\,\beta = \beta[x \mapsto D]$

Also, we assume that for any program P we have $dom(P) \neq \mathcal{P}$ so that
we may choose a new program point $p_{new} \in \mathcal{P} \setminus dom(P)$.

Our algorithm can be seen in Figure 5. The basic idea is to start out with a 
maximal annotation, then remove strategy violations one at a time until none are 
left. Violations are removed by inserting dyn commands in the source program 
where generalization is needed. A maximally polyvariant BTA should be used in 
step 1 to get maximal preciseness within the constraints defined by the strategy. 

Termination of the algorithm is guaranteed by the fact that Ann will be 
strictly increasing its dynamicity during each iteration. Well-formedness (wrt. 
the input strategy) of the final result should be evident. We do not claim this
algorithm to be efficient; it merely serves to illustrate how a strategy chosen at
specialization-time can be used to control the result of partial evaluation.

¹ In Tempo, the annotated program may contain several copies of each function, one
for each bt-store it is called with. But within one copy, a statement can have only
one annotation.
Given source program, P, and a strategy, S:

1. Compute Ann := PolyMax(P).

2. If Ann is well-formed wrt. S then stop, outputting Ann; else goto 3.

3. Pick $x \in X$ and $(p_{vio},\beta) \in Ann$ that form a violation of S.

4. Choose a new program point $p_{new}$, set

   $P := P[p_{vio} \mapsto dyn_{p_{new},x};\ p_{new} \mapsto P(p_{vio})]$

5. Goto 1.



Fig. 5. Algorithm implementing BTA parameterized by strategy 
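
The loop of Fig. 5 can be sketched in a few lines. This is a simplified variant, not the algorithm as stated: instead of inserting a dyn command at a fresh program point, the sketch records a pair (point, variable) and generalizes that variable to D whenever the point is entered, so that bt-states stay attached to the original program points. The toy program, the assignment rule, and all names are assumptions.

```python
from collections import deque


def bt_assign(target, sources):
    """Assumed bt-rule: target is static iff all variables read are static."""
    def step(beta):
        new = list(beta)
        new[target] = "S" if all(beta[i] == "S" for i in sources) else "D"
        return tuple(new)
    return step


def bt_skip(beta):
    return beta


# Hypothetical program; variables by index: 0 = c, 1 = d, 2 = y, 3 = z.
PROGRAM = {
    0: (bt_skip, (1, 2)),              # IF c THEN goto 1 ELSE goto 2
    1: (bt_assign(2, (1,)), (2,)),     # y := d
    2: (bt_assign(3, (2,)), (3,)),     # z := y
    3: (bt_skip, ()),                  # end
}


def polymax(program, p0, beta0, forced):
    """Reachable bt-states, generalizing variable i on entry to p if (p, i) is forced."""
    def enter(p, beta):
        return tuple("D" if (p, i) in forced else b for i, b in enumerate(beta))
    start = (p0, enter(p0, beta0))
    seen, work = {start}, deque([start])
    while work:
        p, beta = work.popleft()
        step, successors = program[p]
        beta2 = step(beta)
        for q in successors:
            state = (q, enter(q, beta2))
            if state not in seen:
                seen.add(state)
                work.append(state)
    return seen


def find_violation(ann, strategy, nvars):
    """Return some (point, variable index) violating the strategy, or None."""
    for p, beta in sorted(ann):
        for p2, beta2 in sorted(ann):
            for i in range(nvars):
                if beta[i] != "D" and strategy(i, p, p2, beta, beta2):
                    return p, i
    return None


def bta(program, p0, beta0, strategy):
    """Steps 1-5 of Fig. 5, with generalization on entry replacing dyn insertion."""
    forced = set()
    while True:
        ann = polymax(program, p0, beta0, forced)
        violation = find_violation(ann, strategy, len(beta0))
        if violation is None:
            return ann
        forced.add(violation)          # strictly grows, so the loop terminates


def s_uniform(i, p, p2, beta, beta2):
    return beta2[i] == "D"


def s_pointwise(i, p, p2, beta, beta2):
    return beta2[i] == "D" and p == p2
```

Each iteration adds a pair that was not yet forced (the violating variable is still static at that point), and there are only finitely many such pairs, which mirrors the termination argument given for the original algorithm.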

7 An Example Strategy 



To illustrate our method, we show a new strategy that can be modeled in our 
framework. It is characterized by separate treatment of different language con- 
structs, e.g. conditionals, loops and procedures. 

The idea is to minimize code explosion in the residual program while being
robust wrt. procedure inlining², a feature that is not currently achieved by any
system implementing polyvariant procedure calls for an imperative language.
We also wish to allow polyvariance elsewhere as long as it can only lead to code
explosion in the annotated program, not the residual program. Towards this
end, we decree that loop entry points may only be annotated polyvariantly if the
test-expression (i.e. the loop condition) is static, in which case only one branch
will be chosen by the specializer (leaving the other branch as dead code in the
annotated program). We denote by $\mathcal{P}_{loopentry}$ the set of program
points that constitute loop entries. The new strategy is defined by

$S_{example} = p \in \mathcal{P}_{loopentry} \ \wedge\ \beta(test(p)) = D \ \wedge\ S_{pointwise}$

Here, the term $\beta(test(p)) = D$ is a shorthand for stating that the test
expression of the loop starting at p is dynamic in $\beta$. The above strategy
will not always prevent code explosion, and it does not guarantee termination of
the specialization phase. However, it demonstrates that reasonable heuristics can
be simple to phrase.

An example where this strategy turns out to be useful is shown in Fig. 6. 
The source program is a fragment of an interpreter for a Fortran-like language 
with one local and one global scope. Beside the input expression, the position of 
the global scope in the store is also statically known. However, the store itself 
and the position of the local scope in the store are dynamic. 

Source code

    10  PROCEDURE eval(E);
    11    CASE E.op:
    12      'cnst: RETURN E.val;
    13      'var:  RETURN lookup(E.id);
    14      '+:    RETURN eval(E.Lexp)+
                          eval(E.Rexp);
    20  PROCEDURE lookup(Id):
    21    COMMON GlobS, LocS, St;
    22    INTEGER Cur;
    23    Cur:=LocS;
    24    WHILE (St[Cur].id≠Id) DO
    25      IF (St[Cur]='end)
    26        THEN Cur:=GlobS;
    27        ELSE Cur:=Cur+1;
    28    ENDWHILE
    29    RETURN St[Cur].val;
    30  END;

Residual code of eval((2+3)+x)

         (* E = (2+3)+x *)
         (* GlobS = 0 *)
         (* LocS and St are dynamic. *)
    100  COMMON LocS, St;
    101  INTEGER Cur;
    102  Cur:=LocS;
    103  WHILE (St[Cur].id≠'x) DO
    104    IF (St[Cur]='end)
    105      THEN Cur:=0;
    106      ELSE Cur:=Cur+1;
    107  ENDWHILE
    108  RETURN 5+St[Cur].val;

Fig. 6. Specialization of an interpreter fragment using the example strategy

The reader may verify that a uniform BTA will not achieve satisfactory
specialization in this example. As demonstrated [8] in a similar case,
the return value of eval will be considered dynamic, disallowing full evaluation
of 2 + 3. On the other hand, using a maximally polyvariant BTA, we run into a
different problem. In the WHILE-loop of procedure lookup, there is a possibility
of variable Cur turning static (by assigning to it the value of GlobS). This
possibility will be explored by the specializer. However, since Cur increases
under dynamic control, specialization will run into an infinite loop.

Now consider our example strategy. Because of the polyvariant procedure
annotation, (2 + 3) can be completely evaluated. Since the (dynamically controlled)
WHILE-loop must be annotated monovariantly, Cur will always be considered
dynamic and we avoid infinite specialization. Thus, we avoid both problems and
obtain useful residual code.

² That is, treating both procedure entry and exit fully polyvariantly.

8 Related Work 

Binding-time analysis for partial evaluation was first developed in the con- 
text of functional languages [20] and was later carried over to imperative lan- 
guages [16,1]. Analyses for imperative languages are usually more complex due to 
the different storage model, side-effects, aliasing of variables and pointer manipu- 
lations [2,17]. Binding-time analyses for object-oriented languages are still under 
development. A multi-level binding-time analysis [15] analyses source programs 
over an abstract domain representing two or more stages of computation. 

Regardless of the source language or the type of analysis, all existing offline 
partial evaluators fix one particular binding-time strategy for program
specialization (e.g., [3,9,12,22]). The most popular strategy, due to its conceptual
simplicity, is to annotate programs using a uniform binding-time analysis [19]. The
use of a flow-sensitive binding-time analysis for partial evaluation was pioneered 
in Tempo [17,18]. 
Few attempts have been made to examine the impact of different annotation
strategies on the quality of the residual programs and the specialization process.
Notable exceptions are [11,6], who developed a polyvariant BTA for a higher-order
applicative language, and [23], who implemented a polyvariant BTA for
the Similix partial evaluator. An alternative approach was suggested in [8] where
polyvariance is achieved by instrumenting programs with explicit bt-values and 
performing partial evaluation in two passes; [24] used the interpretive approach 
to the same effect. These works deal with higher-order functional languages. 

Strategies for guaranteeing termination of the specialization process without
being overly conservative are a topic of current research [5,14]. These strategies
ensure termination by controlling the degree of polyvariance at specialization 
time by dynamizing appropriate variables. A speed-up analysis that predicts the 
relative speedup of residual programs obtained using a uniform annotation was 
studied in [4]. 

9 Conclusion 

Our goal was to develop the foundations for an adaptive approach to binding- 
time analysis which is flexible and powerful enough to study the impact of 
binding-time strategies in a realistic context. We advocate that partial evalu- 
ation systems be built that allow flexibility in the BTA instead of hard-coding a 
single strategy on pragmatic grounds. We showed that different BTA strategies 
drastically influence the quality of generated programs. The strategy language
we developed allows us to catalog and design different BTA strategies. 
Abstract. We present an approach to software verification by program
inversion, exploiting recent progress in the field of automatic program 
transformation, partial deduction and abstract interpretation. Abstrac- 
tion-based partial deduction can work on infinite state spaces and pro- 
duce finite representations of infinite solution sets. We illustrate the po- 
tential of this approach for infinite model checking of safety properties. 



1 Introduction 

Modern computing applications increasingly require software and hardware sys- 
tems that are extremely reliable. Unfortunately, current validation techniques 
are often unable to provide high levels of assurance of correctness either due to 
the size and complexity of these systems, or because of fundamental limitations 
in reasoning about a given system. This paper examines the latter point showing 
that abstraction-based partial deduction can serve as a powerful analytical tool. 
This has several advantages in comparison with, e.g., standard logic program- 
ming. Among others, abstraction-based partial deduction has the ability to form 
recursively defined answers and can be used for program verification. 

We apply the inversion capabilities of abstraction-based partial deduction 
to other languages using interpretive definitions. This means that a wide class 
of different verification tasks can be analyzed in a common framework using a 
set of uniform transformation techniques. We examine the potential for infinite 
model checking, and support our claims by several computer experiments. 

2 Inversion, Partial Deduction, and Interpreters 

Inversion. While direct computation is the calculation of the output of a
program for a given input, inverse computation is the calculation of the possible
input of a program for a given output. Consider the familiar append program:
it can be run forwards (to concatenate two lists) and backwards (to split a list
into sublists). Advances in this direction have been made in logic programming,
based on solutions emerging from logic and proof theory.

[Figure 1: diagram not reproduced; surviving labels: pi-calc, pi-def, ab-spec.]

Fig. 1. Abstraction-based partial deduction ab-spec applied to Petri nets,
π-calculus, and functional programs via interpretive language definitions

However, inversion problems are not restricted to logic languages. Reasoning 
about the correctness of, say, a software specification, one may need to verify 
whether and how a critical state can be reached from any earlier state. This 
analysis requires inverse computation. The key idea is this: to show that a given 
system satisfies a given specification — representing a safety property — start with 
the bad states violating the specification, work backwards and show that no initial
state leads to such a bad state. 

Abstraction-Based Partial Deduction. The relationship between abstract
interpretation and program specialisation has been observed and several formal
frameworks have been developed [6,10,9]. Abstraction-based partial deduction
(APD) combines these two approaches and can thereby solve specialisation and
analysis tasks which are outside the reach of either method alone [12,11]. It
was shown that program specialisation combined with abstract interpretation 
can vastly improve the power of both techniques (e.g., going beyond regular 
approximations or set-based analysis) [12]. 

Interpreters. Language-independence can be achieved through the interpretive 
approach [18,7,1]: an interpreter serves as mediator between a (domain-specific) 
language and the language for which the program transformer is defined. Efficient 
implementations can be obtained by removing the interpretive overhead using 
program specialisation (notable examples are the Futamura projections). Work
on porting inverse computation to new languages includes the inversion of im- 
perative programs by treating their relational semantics as logic programs [15] 
and applying the Universal Resolving Algorithm to interpreters written in a 
functional language [1]. 

Our Approach. The approach we will pursue in this paper is twofold. First, we 
apply the power of APD to inverse computation tasks. Instead of enumerating 
a list of substitutions, as in logic programming, we produce a new logic program
which can be viewed as a model of the original program instantiated to the given
query. The transformation will (hopefully) derive a much simpler program (such
as p :- fail), but APD also has the ability to form recursively defined programs.

Second, we use the interpretive approach to achieve language-independence.
APD is implemented for a logic language, but we can apply its inversion
capabilities to different language paradigms, such as Petri nets and the π-calculus,
via interpreters without having to write tools for each language (see Fig. 1).

To put these ideas to a trial, we use the ECCE logic program specialiser [12,13],
which employs advanced control techniques such as characteristic trees to guide
the specialisation process, coupled with an abstract interpretation technique. (A
more detailed technical account is beyond the scope of this extended abstract; 
the interested reader will find a complete description in [12,13].) This APD- 
system does not yet implement the full power of [12,11], but it will turn out to 
be sufficiently powerful for our purposes. 

3 Advanced Inversion Tasks for Logic Programs 

To illustrate three questions about a software requirement specification that
rely on solving inversion problems, let us consider a familiar example:
exponentiation of natural numbers ($z = x^y$).

1. Existence of solution? Given output state z (e.g. z = 3), does there exist an
   input state x, y with y > 1 that gives rise to z? Answer: state z = 3 can
   never be reached. Observe that here we are not interested in the values of
   x, y; we just want to know whether such values exist. We will call such a
   setting inversion checking.

2. Finiteness of solution? Given output state z (e.g. z = 4), is there a finite
   number of input states x, y that can give rise to z? Answer: only two states
   (x = 4, y = 1 and x = 2, y = 2) lead to z = 4.

3. Finite description of infinite solution? Given output state z (e.g. z = 1),
   can an infinite set of input states be described in a finite form? Answer: any
   input state with y = 0 leads to z = 1, regardless of x.
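
To make the three questions concrete, the following sketch answers them by brute-force enumeration over a bounded range. This is deliberately not the technique of the paper: the function name `inverse_exp` and the search bound are assumptions, and the point of the example is that plain enumeration can only ever list a finite part of an infinite solution set.

```python
def inverse_exp(z, bound):
    """All pairs (x, y) with 0 <= x, y <= bound and x ** y == z.

    Python evaluates 0 ** 0 to 1, in line with the convention that any
    base raised to the exponent 0 gives 1.
    """
    return [(x, y)
            for x in range(bound + 1)
            for y in range(bound + 1)
            if x ** y == z]


# Question 1: is z = 3 reachable with y > 1 (within the bound)?
no_solution = [s for s in inverse_exp(3, 10) if s[1] > 1]

# Question 2: which inputs give z = 4 (within the bound)?
two_solutions = inverse_exp(4, 10)

# Question 3: the number of inputs giving z = 1 keeps growing with the bound.
small = inverse_exp(1, 3)
large = inverse_exp(1, 10)
```

For z = 1 the enumeration also returns the inputs with x = 1 and arbitrary y, and its length grows with the bound, so no bounded search can report the full answer; a finite description requires a different representation of the solution set.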

Example 1. All three questions from above can be answered with APD. Consider
a logic program encoding exponentiation of natural numbers where numbers are
represented by terms of type $\tau = 0 \mid s(\tau)$.

exp(Base,0,s(0)).
exp(Base,s(Exp),Res) :- exp(Base,Exp,BE), mul(BE,Base,Res).
mul(0,X,0).
mul(s(X),Y,Z) :- mul(X,Y,XY), plus(XY,Y,Z).
plus(0,X,X).
plus(s(X),Y,s(Z)) :- plus(X,Y,Z).

1. Existence of solution. Inverting the program for $x^y = 3$, $y > 1$, that is
   by specialising exp/3 wrt. goal exp(X,s(s(Y)),s(s(s(0)))), produces an empty
   program: no solution exists.

2. Finiteness of solution. Inverting $x^y = 4$, that is by specialising exp/3 wrt.
   goal exp(X,Y,s(s(s(s(0))))), produces a program in which the two solutions
   x = 4, y = 1 and x = 2, y = 2 are explicit:
exp__1(s(s(s(s(0)))),s(0)).
exp__1(s(s(0)),s(s(0))).

3. Finite representation of infinite solution. Finally, inverting $x^y = 1$ can be
   solved by specialising exp/3 wrt. goal exp(X,Y,s(0)). The result is a
   recursive program: infinitely many solutions were found ($x^0$ for any x,
   and $1^y$ for any y) and described in a finite way.¹ This finite description
   is possible in our approach, but not in conventional logic programming,
   because APD generates (recursive) programs instead of enumerating an
   (infinite) list of answers.

exp__1(X1,0).
exp__1(s(0),s(X1)) :- exp_conj__2(X1).
exp_conj__2(0).
exp_conj__2(s(X1)) :- exp_conj__3(X1).
exp_conj__3(0).
exp_conj__3(s(X1)) :- exp_conj__3(X1).
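
The recursive residual program can be read as a finite recogniser for the infinite solution set. The sketch below transliterates its six clauses into Python predicates over ordinary integers (an assumption: the original works on terms built from 0 and s) and compares the result with direct exponentiation on a sample.

```python
def exp_conj_3(n):
    """exp_conj_3(0).  exp_conj_3(s(X1)) :- exp_conj_3(X1)."""
    return n == 0 or exp_conj_3(n - 1)


def exp_conj_2(n):
    """exp_conj_2(0).  exp_conj_2(s(X1)) :- exp_conj_3(X1)."""
    return n == 0 or exp_conj_3(n - 1)


def exp_1(x, y):
    """exp_1(X1,0).  exp_1(s(0),s(X1)) :- exp_conj_2(X1)."""
    if y == 0:
        return True                      # first clause: any base, exponent 0
    return x == 1 and exp_conj_2(y - 1)  # second clause: base 1, exponent >= 1


def agrees_with_power(limit):
    """Check exp_1 against x ** y == 1 for all 0 <= x, y < limit."""
    return all(exp_1(x, y) == (x ** y == 1)
               for x in range(limit) for y in range(limit))
```

The helper predicates succeed for every natural number, which illustrates the footnoted remark that the residual program could be simplified further by post-processing.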



Example 2. As a more practical application, take the following program, which
determines whether a list has an even number of elements (pairl/1) and
deletes from a list an element contained in the list (del/3).

pairl([]).
pairl([A|X]) :- oddl(X).
oddl([A|X]) :- pairl(X).
del(X,[X|T],T).
del(X,[Y|T],[Y|DT]) :- X\=Y, del(X,T,DT).

One might want to verify the property that deleting an element from a pair list
will not result in a pair list. This can be translated into requiring that the
following predicate always fails:

error(X,L) :- pairl(L), del(X,L,DL), pairl(DL).

This can be deduced by our APD-system, which produces error__1(X,L) :- fail. To
conclude, APD can invert programs in ways not possible with other approaches.
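
The property of Example 2 can at least be tested on small inputs. The sketch below mirrors pairl/1, del/3 and error/2 with Python functions (the names and the two-letter alphabet are assumptions) and searches all lists up to a given length for a counterexample. Unlike the specialiser, which derives failure for all inputs, this check covers only the finitely many lists it enumerates.

```python
from itertools import product


def pairl(lst):
    """True iff the list has an even number of elements."""
    return len(lst) % 2 == 0


def delete_first(x, lst):
    """Remove the first occurrence of x; None models failure of del/3."""
    if x not in lst:
        return None
    i = lst.index(x)
    return lst[:i] + lst[i + 1:]


def error(x, lst):
    """Mirror of error(X,L): a pair list stays a pair list after deletion."""
    deleted = delete_first(x, lst)
    return pairl(lst) and deleted is not None and pairl(deleted)


def counterexamples(max_len, alphabet="ab"):
    """All (x, list) pairs up to max_len for which error succeeds."""
    return [(x, list(l))
            for n in range(max_len + 1)
            for l in product(alphabet, repeat=n)
            for x in alphabet
            if error(x, list(l))]
```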

4 Case Study: Inversion and Infinite Model Checking 

Recent years have seen considerable growth [5] in the application of model check- 
ing techniques [4,3] to the validation and verification of correctness properties 
of hardware, and more recently software systems. The method is to model a 
hardware or software system as a finite, labelled transition system (LTS) which 
is then exhaustively explored to decide whether a given specification holds for 
all reachable states. One can even use tabling-based logic programming as an 
efficient means of performing explicit model checking [14]. 

However, many software systems cannot be modelled by a finite LTS (or 
similar system) and, as a consequence, there has been a lot of effort to enable 
infinite model checking (e.g., [17]). We argue that inverse computation in gen- 
eral, and our APD-technique in particular, has a lot to offer for this avenue of 
research: 

¹ The residual program can be improved by better post-processing.
- The system to be verified can be modelled as a program (possibly by means of
an interpreter). This obviously includes finite LTS but also allows systems
with an infinite number of states to be expressed.

- Model checking of safety properties then amounts to inversion checking: we 
prove that a specification holds by showing that there exists no trace (the 
input argument) which leads to an invalid state. 

- To be successful, infinite model checking requires refined abstractions (a 
key problem mentioned in [5]). The control of generalisation of APD pro- 
vides just that (at least for the examples we treated so far). In essence, the 
specialisation component of APD performs a symbolic traversal of the state 
space, thereby producing a finite representation of it, on which the abstract 
interpretation performs the verification of the specification. 

Consider the Petri net shown in Fig. 2. It models a single process which may enter 
a critical section (cs), the access to which is controlled by a semaphore (sema). 
The Petri net can be encoded directly as a logic program using an interpreter 
for Petri-nets trace/3, where the object-level Petri net is represented by trans/3 
facts. The trace/3 predicate checks for enabled transitions and fires them. 

The initial marking of trace/3 can be seen in start/3: 1 token in the sema- 
phore (sema), no tokens in the reset counter (c), no processes in the critical 
section (cs) and no processes in the final place (y). There may be X processes in 
the initial place (x). Again, numbers are represented by terms of type
$\tau = 0 \mid s(\tau)$. More processes can be modelled if we increase the number
of tokens in the initial place (x). Forward execution of the Petri net means:
given an initial value for X and a sequence of transitions (a trace), determine
the marking(s) that can be reached.

Let us now check a safety property of the given Petri net, namely that it is 
impossible to reach a marking where two processes are in their critical section 
at the same time. Clearly, this is an inversion task: given a marking where two 
processes are in the critical section, try to find a trace that leads to this state.
More precisely we want to do inversion checking, as the desired outcome is to 
prove that no inverse exists. 

Example 3. Inverting the Petri net by specialising the interpreter in Fig. 2 wrt.
the query start(Tr,s(0),[X,S,s(s(CS)),Y,C]) we obtain the following program:

start(Tr,s(0),[X3,X4,s(s(X5)),X6,X7]) :- fail.

This inversion task cannot be solved by Prolog (or XSB-Prolog [16] with
tabling), even when adding moding or delay declarations. Due to the counter (c)
we have to perform infinite model checking, which in turn requires abstraction
and symbolic execution. Both of these are provided by our APD approach.
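
For comparison, the safety property can be tested by bounded forward exploration. The sketch below is not the APD technique and proves nothing beyond the explored depth. Since the net of Fig. 2 is not reproduced here, the three transitions are an assumed reading of it (a process enters the critical section by taking the semaphore, leaves it by releasing the semaphore, and restarts while incrementing the counter); the names and the depth bound are likewise assumptions.

```python
from collections import deque

# Marking layout: (x, sema, cs, y, c).
# Assumed transitions, each given as (tokens consumed, tokens produced).
TRANS = {
    "enter_cs": ((1, 1, 0, 0, 0), (0, 0, 1, 0, 0)),
    "exit_cs":  ((0, 0, 1, 0, 0), (0, 1, 0, 1, 0)),
    "restart":  ((0, 0, 0, 1, 0), (1, 0, 0, 0, 1)),
}


def successors(marking):
    """Fire every enabled transition once and yield the resulting markings."""
    for name, (pre, post) in TRANS.items():
        if all(m >= need for m, need in zip(marking, pre)):
            yield name, tuple(m - need + add
                              for m, need, add in zip(marking, pre, post))


def unsafe_reachable(processes, max_depth, sema=1):
    """Breadth-first search for a marking with two processes in cs.

    The counter c grows without bound, so the state space is infinite and
    the search has to be cut off at max_depth.
    """
    start = (processes, sema, 0, 0, 0)
    seen, work = {start}, deque([(start, 0)])
    while work:
        marking, depth = work.popleft()
        if marking[2] >= 2:
            return True
        if depth < max_depth:
            for _, nxt in successors(marking):
                if nxt not in seen:
                    seen.add(nxt)
                    work.append((nxt, depth + 1))
    return False
```

With one token in the semaphore the search finds no unsafe marking within the bound, whereas starting with two semaphore tokens (`sema=2`) quickly reaches one. The bounded search cannot rule out violations beyond `max_depth`, which is the gap that abstraction and symbolic execution are meant to close.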



Example 4. Similarly, one can prove the safety property regardless of the number
of processes, i.e., for any number of tokens in the initial place (x). When we
specialise the interpreter of Fig. 2 for the query unsafe(X,s(0),0,0,0) we get
(after 2 iterations each of the specialisation and abstract interpretation
components of ECCE):

start(Tr,Processes,[X3,X4,s(s(X5)),X6,X7]) :- fail.

5 Porting to Other Languages and Paradigms 

We can apply the power of our APD-approach, along with its capabilities for
inversion and verification [8], to the π-calculus by writing an interpreter for it.
We have also successfully ported inverse computation to a functional language
via an interpreter (omitted from this extended abstract). Apart from highlighting
the power of our approach, these examples provide further computational evidence
for the theoretical result [1] that inverse computation can be applied to
arbitrary languages via interpreters.

6 Conclusion and Assessment 

We presented an approach to program inversion, exploiting progress in the field 
of automatic program transformation, partial deduction and abstract interpre- 
tation. We were able to port these inversion capabilities to other languages via 
interpretive definitions. We examined the potential for infinite model checking 
of safety properties, and supported our claims by computer experiments. We
believe that, by exploiting the connections between software verification and
automatic program specialisation, one may be able to significantly extend the
capabilities of analytical tools that inspect the input/output behaviour.

The emphasis was on novel ways of reasoning rather than efficiency and large
scale applications. In principle, it is possible to extend our approach to verify
larger, more complicated infinite systems.^ As with all automatic specialisation
tools, several points need to be addressed: allowing more generous unfolding and
polyvariance (efficiency, both of the specialisation process and of the
specialised program, is less of an issue in model checking) to obtain more
precise residual programs; implementing the full algorithm of [11], which allows
for more fine-grained abstraction; and using BDD-like representations whenever
possible. Currently we can only verify safety properties (i.e., no bad things
happen) and not liveness properties (i.e., good things will eventually happen).
The latter might be feasible with more sophisticated support for negation.

^ Larger systems have been approached with related techniques as a processing
phase [9]. However, their purpose is to reduce the state space rather than provide
novel ways of reasoning. Another approach related to ours is [2].
Abstract. The current state of the art for ensuring finite unfolding of
logic programs consists of a number of online techniques where unfold- 
ing decisions are made at specialisation time. Introduction of a static 
termination analysis phase into a partial deduction algorithm permits 
unfolding decisions to be made offline, before the actual specialisation 
phase itself. This separation improves specialisation time and facilitates 
the automatic construction of compilers and compiler generators. The 
main contribution of this paper is how this separation may be achieved 
in the context of logic programming, while providing non-trivial support 
for partially static datastructures. 

The paper establishes a solid link between the fields of static termination 
analysis and partial deduction enabling existing termination analyses to 
be used to ensure finiteness of the unfolding process. This is the first of- 
fline technique which allows arbitrarily partially instantiated goals to be 
sufficiently unfolded to achieve good specialisation results. Furthermore, 
it is demonstrated that an offline technique such as this one can be imple- 
mented very efficiently and, surprisingly, yield even better specialisation 
than a (pure) online technique. It is also, to our knowledge, the first 
offline approach which passes the KMP test (i.e., obtaining an efficient 
Knuth-Morris-Pratt pattern matcher by specialising a naive one). 



1 Introduction 

Control of partial deduction — a technique for the partial evaluation of pure logic 
programs — is divided into two levels. The local level guides the construction of 
individual SLDNF-trees while the global level manages the forest, determining 
which, and how many trees should be constructed. Each tree gives rise to a spe- 
cialised predicate definition in the final program so the global control ensures a 
finite number of definitions are generated and also controls the amount of poly- 
variance. The local control on the other hand determines what each specialised 
definition will look like. 

Techniques developed to ensure finite unfolding of logic programs [2, 22, 21] 
have been inspired by the various methods used to prove termination of rewrite 
systems [7, 6]. Whilst by no means ad hoc, these techniques bear little direct
relation to those used for proving termination of logic programs (or even those
of rewrite systems). This means that advances in static termination analysis
technology do not directly contribute to improving the control of partial
deduction. This paper aims to bridge this gap.



D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI’99, LNCS 1755, pp. 101-112, 2000.
Moreover, the control described in [2, 22, 21] as well as the more recent
[27, 16, 15] are inherently online, meaning that they are much slower than offline 
approaches and that they are not based on a global analysis of the program’s 
behaviour which enables control decisions to be taken before the actual special- 
isation phase itself. 

Offline approaches to local control of partial deduction on the other hand 
[25, 11, 12, 3] have been very limited in other respects. Specifically, each atom 
in the body of a clause is marked as either reducible or non-reducible. Reducible 
atoms are always unfolded while non-reducible atoms on the other hand are 
never unfolded. Whilst this approach permits goals to be unfolded at normal 
execution speed, it can unduly restrict the amount of unfolding which takes place 
with a detrimental effect on the resulting specialised program. Another problem 
of [25, 12] is that it classifies arguments either as static (known at specialisation 
time) or dynamic (unknown at specialisation time). This division is too coarse, 
however, to allow refined unfolding of goals containing partially instantiated data 
where some parts of the structure are known and others unknown. Such goals 
are very common in logic programming, and the key issue which needs to be 
considered is termination. A partial solution to this problem has been presented 
in [3], but it still sticks with the limited unfolding mentioned above and can 
“only” handle a certain class of partially instantiated data (data bounded wrt 
some semi-linear norm). 

A Sonic Approach. This paper proposes a flexible solution to the local ter- 
mination problem for offline partial deduction of logic programs, encompassing 
the best of both worlds. Based on the cogen approach^ for logic programs [12], 
the construction of a generating extension will be described which “compiles in” 
the local unfolding rule for a program and is capable of constructing maximally 
expanded SLDNF-trees of finite depth. 

The technique builds directly on the work of [23] which describes a method 
for ensuring termination of logic programs with delay. The link here is that the 
residual goals of a deadlocked computation are the leaves of an incomplete SLD- 
tree. The basic idea is to use static analysis to derive relationships between the 
sizes of goals and the depths of derivations. This depth information is incorpo- 
rated in a generating extension and is used to accurately control the unfolding 
process. At specialisation time the sizes of certain goals are computed and the 
maximum depth of subsequent derivations is fixed according to the relationships 
derived by the analysis. In this way, termination is ensured whilst allowing a 
flexible and generous amount of unfolding. Section 3 reviews the work of [23] 
and shows how it can be used directly to provide the basis of a generating ex- 
tension which allows finite unfolding of bounded goals. A simple extension to 
the technique is described in Section 4 which also permits the safe unfolding of 
unbounded goals. 

This is the first offline approach to partial deduction which is able to
successfully unfold arbitrarily partially instantiated (i.e. unbounded) goals. In
fact, it is demonstrated that the method can, surprisingly, yield even better
specialisation than (pure) online techniques. In particular, some problematic
issues in unfolding, notably unfolding under a coroutining computation rule and
the back propagation of instantiations [21], can be easily handled within the
approach (for further details see [24]). Furthermore, it is the first offline
approach which passes the KMP test (i.e., obtaining an efficient
Knuth-Morris-Pratt pattern matcher by specialising a naive one), as demonstrated
by the extensive experiments in Section 6.

^ Instead of trying to achieve a compiler generator (cogen) by self-application [8]
one writes the cogen directly [26].

An analysis which measures the depths of derivations may be termed a sound- 
ing analysis. Section 5 describes how such an analysis can be based on existing 
static termination analyses which compute level mappings and describes how the 
necessary depths may be obtained from these level mappings. Unfolding based
on a sounding analysis, then, is the basis of sonic partial deduction.

2 Preliminaries 

Familiarity with the basic concepts of logic programming and partial deduction is
assumed [19, 20]. A level mapping (resp. norm) is a mapping from ground atoms
(resp. ground terms) to natural numbers. For an atom $A$ and level mapping
$|.|$, $A_{|.|}$ denotes the set $\{|A\theta| \mid A\theta \text{ is ground}\}$.
An atom $A$ is (un)bounded wrt $|.|$ if $A_{|.|}$ is (in)finite [4]. For this
paper, the notion of level mapping is extended to non-ground atoms by defining,
for any atom $A$, $|A| = \min(A_{|.|})$; and similarly for norms. The norm
$|t|_{len}$ returns the length of the list $t$. A list $t$ is rigid iff
$|t|_{len} = |t\theta|_{len}$ for all $\theta$. A clause
$c : H \leftarrow A_1, \ldots, A_n$ is recurrent if for every grounding
substitution $\theta$ for $c$, $|H\theta| > |A_i\theta|$ for all $i \in [1, n]$.

3 Unfolding Bounded Atoms 

A fundamental problem in adapting techniques from the termination literature 
for use in controlling partial deduction is that the various analyses that have 
been proposed (see [4] for a survey) are designed to prove full termination for a 
given goal and program, in other words guaranteeing finiteness of the complete 
SLDNF-tree constructed for the goal. For example, consider the goal ← Flatten([x,
y, z], w) and the program Flatten consisting of the clauses app1, app2, flat1 and
flat2.

flat1  Flatten([], []).
flat2  Flatten([e|x], r) ← Append(e, y, r) ∧ Flatten(x, y).
app1   Append([], x, x).
app2   Append([u|x], y, [u|z]) ← Append(x, y, z).



A typical static termination analysis would (correctly) fail to deduce
termination for this program and goal. Most analyses can infer that a goal of the
form ← Flatten(x, y) will terminate if x is a rigid list of rigid lists, or if x
is a rigid list and y is a rigid list. In the context of partial deduction,
however, such a condition for termination will usually be too strong. The problem
is that the information relating to the goal, by the very nature of partial
deduction, is often incomplete. For example, the goal ← Flatten([x, y, z], w) will
not terminate but the program can be partially evaluated to produce the following
specialised definition of Flatten/2.

Flatten([x, y, z], r) ← Append(x, r1, r) ∧ Append(y, r2, r1) ∧ Append(z, [], r2).

The scheme described in [23] transforms programs into efficient and
terminating programs. It will for instance transform the non-terminating program
Flatten into the following efficient, terminating program, by adding an extra
depth parameter.

flat*  Flatten(x, y) ← SetDepth_F(x, d) ∧ Flatten(x, y, d).
       DELAY Flatten(_, _, d) UNTIL Ground(d).
flat1  Flatten([], [], d) ← d ≥ 0.
flat2  Flatten([e|x], r, d) ← d ≥ 0 ∧ Append(e, y, r) ∧ Flatten(x, y, d − 1).

app*   Append(x, y, z) ← SetDepth_A(x, z, d) ∧ Append(x, y, z, d).
       DELAY Append(_, _, _, d) UNTIL Ground(d).
app1   Append([], x, x, d) ← d ≥ 0.
app2   Append([u|x], y, [u|z], d) ← d ≥ 0 ∧ Append(x, y, z, d − 1).
For now, assume that the (meta-level) predicate SetDepth_F(x, d) is defined
such that it always succeeds, instantiating the variable d to the length of the
list x if this is found to be rigid (i.e., $|x|_{len} = |x\theta|_{len}$ for
every substitution $\theta$), and leaving d unbound otherwise. Note that a call to
Flatten/3 will proceed only if its third argument has been instantiated as a
result of the call to SetDepth_F(x, d). The purpose of this last argument is to
ensure finiteness of the subsequent computation. More precisely, d is an upper
bound on the number of calls to the recursive clause flat2 in any successful
derivation. Thus by failing any derivation where the number of such calls has
exceeded this bound (using the test d ≥ 0), termination is guaranteed without
losing completeness. The predicate SetDepth_A/3 is defined in a similar way, but
instantiates d to the minimum of the lengths of the lists x and z, delaying if
both x and z are unbounded.
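The behaviour assumed for the two SetDepth predicates can be sketched in Python
as follows. This is only an illustration: the representation of a list term as a
pair (known elements, closed flag) and the function names set_depth_f and
set_depth_a are assumptions made for this sketch, and None stands for leaving d
unbound, so that the delayed call stays suspended.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical model of a list term: the known leading elements plus a flag
# that is True when the list ends in [] (rigid) and False when it ends in an
# unbound variable (unbounded).
ListTerm = Tuple[Sequence[str], bool]


def set_depth_f(x: ListTerm) -> Optional[int]:
    """Depth for Flatten/3: the length of x if x is rigid, otherwise unbound."""
    elems, closed = x
    return len(elems) if closed else None


def set_depth_a(x: ListTerm, z: ListTerm) -> Optional[int]:
    """Depth for Append/4: the minimum length over the rigid lists among x, z."""
    rigid_lengths = [len(elems) for elems, closed in (x, z) if closed]
    return min(rigid_lengths) if rigid_lengths else None
```

With this reading, a goal such as Append([1], y, r) receives depth 1 because only
its first argument is rigid, while an Append goal whose first and third arguments
are both open lists receives no depth and delays.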

The main result of [23] guarantees that the above program will terminate for
every goal (in some cases the program will deadlock). Moreover, given a goal of
the form ← Flatten(x, y) where x is a rigid list of rigid lists or where x is a
rigid list and y is a rigid list, the program does not deadlock and produces all
solutions to such a goal. In other words, both termination and completeness of
the program are guaranteed.

Since the program is terminating for all goals, it can be viewed as a means of 
constructing a finite (possibly incomplete) SLD-tree for any goal. As mentioned 
above, it is indeed capable of complete evaluation but a partial evaluation for 
bounded goals may also be obtained. Quite simply, the deadlocking goals of the 
computation are seen to be the leaf nodes of an incomplete SLD-tree. 
For example, the goal ← Flatten([x, y, z], r) leads to deadlock with the residual
goal ← Append(x, r1, r, d1) ∧ Append(y, r2, r1, d2) ∧ Append(z, [], r2, d3).
Removing the depth bounds, this residue can be used to construct a partial
evaluation of the original goal resulting in the specialised definition of
Flatten/2 above.
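The shape of this residue for a rigid outer list of any length can be sketched
in Python. The function specialise_flatten and the variable names r1, r2, ... are
hypothetical. The sketch simply mimics unfolding the recursive Flatten clause once
per element and the base clause at the end, leaving every Append call as a leaf,
and it writes the clause with ASCII "<-" and "&" in place of ← and ∧.

```python
from typing import List


def specialise_flatten(elems: List[str]) -> str:
    """Residual clause for Flatten([e1, ..., en], r) with a rigid outer list."""
    if not elems:
        return "Flatten([], [])."  # the base clause applies directly
    body = []
    out = "r"  # list produced by the Append call generated next
    for i, elem in enumerate(elems, start=1):
        # One unfolding of the recursive clause yields Append(elem, y, out);
        # for the last element the base clause binds y to [].
        rest = "[]" if i == len(elems) else f"r{i}"
        body.append(f"Append({elem}, {rest}, {out})")
        out = rest
    head = f"Flatten([{', '.join(elems)}], r)"
    return f"{head} <- {' & '.join(body)}."
```

Calling specialise_flatten(["x", "y", "z"]) yields the three chained Append calls
of the specialised definition given earlier.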

The approach, thus far, is limited in that it can only handle bounded goals.
For unbounded goals the unfolding will deadlock immediately and it is not
possible, for example, to specialise ← Flatten([[], [a] | z], r) in a non-trivial
way. This strong limitation will be overcome in the following sections.

The method proposed in [1] (and further developed in [21]) ensures the con- 
struction of a finite SLD-tree through the use of a measure function which as- 
sociates with each node (goal) in the tree a weight from a well-founded set. 
Finiteness is ensured by imposing the condition that the weight of any goal is 
strictly less than the weight of its direct covering ancestor. This last notion is 
introduced to prevent the comparison of unrelated goals which could precipitate 
the end of the unfolding process. Consider the atoms Append([1], y, r, 1) and
Append([2], y1, y, 1) appearing in the LD-tree for Flatten([[1], [2]], r, 2) (see
Figure 1 in the appendix). Any sensible measure function would assign exactly the
same weight to each atom. But, if these weights were compared, unfolding would be
prematurely halted after four steps. Hence, this comparison must be avoided and
this is justified by the fact that the atoms occur in separate “sub-derivations.”
In the sonic approach, the above notions are dealt with implicitly. Figure 1
depicts the SLD-tree for the goal ← Flatten([[1], [2]], r, 2) using the
transformed version of
Flatten. The depth argument of each atom may be seen as a weight as described 
above. Note that the weight of any atom in a sub-derivation (except the first) is 
implicitly derived from the weight of its direct covering ancestor by the process 
of resolution. This conceptual simplicity eliminates the need to explicitly trace 
direct covering ancestors, improving performance of the specialisation process. 

4 Unfolding Unbounded Atoms 

The main problem with the above transformation is that it only allows the un- 
folding of bounded goals. Often, as mentioned in the introduction, to achieve 
good specialisation it is necessary to unfold unbounded atoms also. This is es- 
pecially true in a logic programming setting, where partially instantiated goals 
occur very naturally even at runtime. This capability may be incorporated into 
the above scheme as follows. Although an atom may be unbounded, it may well 
have a minimum size. For example, the length of the list [1,2,3|x] must be at
least three regardless of how x may be instantiated. In fact, this minimum size
is an accurate measure of the size of the part of the term which is partially
instantiated and this may be used to determine an estimate of the number of
unfolding steps necessary for this part of the term to be consumed in the
specialisation process. For example, consider the Append/3 predicate and the goal
← Append([1,2,3|x], y, z). Given that the minimum size of the first argument is
three it may be estimated that at least three unfolding steps must be performed.
Now suppose that the number of unfolding steps is fixed at one plus the minimum
(this will usually give exactly the required amount of specialisation). The
transformed Flatten program may now be used to control the unfolding by simply
calling ← Append([1,2,3|x], y, z, 3). The problem here, of course, is that
completeness is lost, since the goal fails if x does not become instantiated to
[]. To remedy this, an extra clause is introduced to capture the leaf nodes of
the SLD-tree. The Append/3 predicate would therefore be transformed into the
following.

app1   Append([], x, x, d) ← d ≥ 0.
app2   Append([u|x], y, [u|z], d) ← d ≥ 0 ∧ Append(x, y, z, d − 1).
app3   Append(x, y, z, d) ← d < 0 ∧ Append(x, y, z, _).

The call to Append/4 in the clause app3 immediately suspends since the depth
argument is uninstantiated. The clause is only selected when the derivation
length has exceeded the approximated length and the effect is that a leaf node
(residual goal) is generated precisely at that point. For this reason, such a
clause is termed a leaf generator in the sequel. Now for the goal
← Append([1,2,3|x], y, z, 3) the following resultants are obtained.

Append([1,2,3], y, [1,2,3|y], 3) ←
Append([1,2,3,u|x'], y, [1,2,3,u|z'], 3) ← Append(x', y, z')

Observe that the partial input data has been completely consumed in the 
unfolding process. In fact, in this example, one more unfolding step has been 
performed than is actually required to obtain an “optimal” specialisation, but 
this is due to the fact that the goal has been unfolded non-deterministically. In 
some cases, this non-deterministic unfolding may actually be desirable, but this 
is an orthogonal issue to termination (this issue will be re-examined in Section 6).
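The effect of the leaf generator can be reproduced with a small Python sketch.
It is not the generating extension itself: terms are plain strings, the helper
names show and unfold_append are hypothetical, and the depth tests are assumed to
be d ≥ 0 for the two ordinary Append clauses and d < 0 for the leaf generator,
which is what the two resultants above require for the initial depth 3. The depth
argument is omitted from the printed heads.

```python
from itertools import count
from typing import List, Optional, Tuple

# (first argument, third argument, residual body or None)
Resultant = Tuple[str, str, Optional[str]]


def show(elems: List[str], tail: str) -> str:
    """Render a list term with the given elements and tail ('[]' or a variable)."""
    if tail == "[]":
        return "[" + ",".join(elems) + "]"
    return "[" + ",".join(elems) + "|" + tail + "]" if elems else tail


def unfold_append(known: List[str], tail: str, depth: int) -> List[Resultant]:
    """Unfold Append(known ++ tail, y, z, depth) and collect the resultants."""
    fresh = ("u" + "'" * i for i in count())  # u, u', u'', ...
    results: List[Resultant] = []

    def step(consumed: List[str], rest: List[str], tail: str, d: int) -> None:
        if d < 0:
            # Leaf generator: stop unfolding and keep a residual call.
            body = f"Append({show(rest, tail)}, y, z')"
            results.append((show(consumed + rest, tail), show(consumed, "z'"), body))
        elif rest:
            # Only the recursive clause matches a non-empty first argument.
            step(consumed + rest[:1], rest[1:], tail, d - 1)
        else:
            # Base clause: the first argument is (or becomes bound to) [].
            results.append((show(consumed, "[]"), show(consumed, "y"), None))
            if tail != "[]":
                # Recursive clause: bind the open tail to [u|x'] and continue.
                step(consumed + [next(fresh)], [], "x'", d - 1)

    step([], list(known), tail, depth)
    return results
```

For unfold_append(["1", "2", "3"], "x", 3) the sketch returns one resultant with
an empty body and one with the residual call Append(x', y, z'), mirroring the two
resultants listed above.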

Furthermore, note that the SetDepth predicates must now be redefined to
assign depths to unbounded atoms. Also a predicate such as SetDepth_A(x, z, d)
must be defined such that d gets instantiated to the maximum of the minimum
lengths of the lists x and z to ensure a maximal amount of unfolding. Note that
this maximum will always be finite.
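A minimal Python sketch of this redefinition is given below. The pair
representation of list terms and the names min_length and set_depth_a are
assumptions of the sketch, not part of the described system.

```python
from typing import Sequence, Tuple

# Hypothetical model of a list term: known leading elements plus a flag that
# is False when the list ends in an unbound variable.
ListTerm = Tuple[Sequence[str], bool]


def min_length(term: ListTerm) -> int:
    """Minimum length of the list over all instantiations of its open tail."""
    elems, _closed = term
    return len(elems)


def set_depth_a(x: ListTerm, z: ListTerm) -> int:
    """Depth for Append/4: the maximum of the minimum lengths of x and z."""
    return max(min_length(x), min_length(z))
```

Since every term has only finitely many known elements, the result is always a
finite number, and for the goal with first argument [1,2,3|x] and an unbound
third argument it is 3.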

5 Deriving Depth Bounds from Level Mappings 

The above transformations rely on a sounding analysis to determine the depths
of derivations or unfoldings. Such an analysis may be based on existing
termination analyses which derive level mappings. To establish the link with the
termination literature, the depth argument in an atom during unfolding may simply
be chosen to be the level of the atom with respect to some level mapping used in
a termination proof. Whilst in principle a depth bound for unfolding may be
derived from any level mapping, in practice this can lead to excessive unfolding
and, as a consequence, poor specialisation. (E.g., based on some termination
analysis, an atom might have a level mapping of 15, diminishing by 5 on every
recursive call. One could give the atom a depth of 15, but in this case the
value of 3 would be much more appropriate, preventing over-eager unfolding.)
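The arithmetic behind this example can be written down directly. The Python
function below is a simple illustration under the assumption that the level
decreases by the same fixed amount on every recursive call; the name depth_bound
is hypothetical and the function is not claimed to be one of the techniques of
[24].

```python
def depth_bound(level: int, decrease: int) -> int:
    """Number of recursive calls a level permits when each call lowers it by
    `decrease`, i.e. the level divided by the decrease, rounded up."""
    if decrease <= 0:
        raise ValueError("the level must strictly decrease on each call")
    return -(-level // decrease)  # ceiling division on integers
```

For a level of 15 that diminishes by 5 per call this gives a depth of 3 rather
than 15.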
A number of techniques have been devised to obtain accurate depth bounds
from fairly arbitrary level mappings derived from termination analyses. Space
restrictions prohibit a detailed presentation here, but the techniques are
extremely simple to apply and introduce minimal overhead (and sometimes none at
all; for further details see [24]). It is important to note, however, that
finiteness can always be guaranteed; the problems encountered only relate to the
quality of the specialisation and this is also dependent on the control of
determinacy. Although this has been touched upon in [9], it is still a relatively
unexplored area in the context of partial deduction. Many of the problems may
disappear altogether with the right balance of bounded and determinate unfolding.



6 Experiments and Benchmarks 

To gauge the efficiency and power of the sonic approach, a prototype
implementation has been devised and integrated into the ECCE partial deduction
system ([13, 14, 18]). The latter is responsible for the global control and code
generation and calls the sonic prototype for the local control. A comparison has
been made with ECCE under the default settings, i.e. with ECCE also providing the
local control using its default unfolding rule (based on a determinate unfolding
rule which uses the homeomorphic embedding relation ⊴ on covering ancestors to
ensure termination). For the global control, both specialisers used conjunctive
partial deduction ([17, 10, 5]) and characteristic trees ([18]).

All the benchmarks are taken from the DPPD library ([13]) and were run on a
Power Macintosh G3 266 MHz with Mac OS 8.1 using SICStus Prolog 3 #6 (Macintosh
version 1.3). Tables 1 and 2 show, respectively, the total specialisation times
(without post-processing) and the time spent in unfolding during
specialisation.^ In Table 1 the times to produce the generating extensions for
the sonic approach are not included, as this is still done by hand. It is possible
to automate this process, and one purpose of hand-coding the generating
extensions was to gain some insight into how this could best be achieved. In any
case, in situations where the same program is repeatedly respecialised, this time
will become insignificant. Due to the limited precision of the statistics/2
predicate, the figures of “0 ms” in Table 2 should be interpreted as “less than
16 ms.” (The runtimes for the residual programs appear in Table 3 in the
appendix, which, for a more comprehensive comparison, also includes some results
obtained by MIXTUS.)

^ Note that, because ECCE uses characteristic trees whereas the sonic prototype
builds trace terms, running the latter involves some extra (in principle
unnecessary) overhead.

The sonic prototype implements a more aggressive unfolding rule than the
default determinate unfolding rule of ECCE. This is at the expense of total
transformation time (see Table 1), as it often leads to increased polyvariance,
but consequently the speed of the residual code is often improved, as can be seen
in Table 3.^ Default ECCE settings more or less guarantee no slowdown, and this
is reflected in Table 3, whereas the general lack of determinacy control in the
prototype sonic unfolding rule leads to two small slowdowns. There is plenty of
room for improvement, however, on these preliminary results. For instance, the
sonic approach is flexible enough to allow determinacy control to be incorporated
within it.
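The speedup figures of Table 3 are the runtime of the original program divided
by the runtime of the residual program. The Python sketch below recomputes a few
of them for the sonic + ECCE column and picks out the slowdowns; the timings are
copied from Table 3, while the names ROWS, speedup and slowdowns are
hypothetical.

```python
# (original runtime, sonic + ECCE residual runtime) in ms, from Table 3.
ROWS = {
    "advisor": (1541, 483),
    "maxlength": (217, 283),
    "relative": (9067, 17),
    "remove": (3650, 4466),
}


def speedup(original_ms: int, residual_ms: int) -> float:
    """Speedup of the residual program, rounded to two decimals as in Table 3."""
    return round(original_ms / residual_ms, 2)


# Benchmarks whose residual program is slower than the original one.
slowdowns = sorted(
    name for name, (orig, res) in ROWS.items() if speedup(orig, res) < 1
)
```

Among these four rows the slowdowns are maxlength and remove, the two benchmarks
with a sonic + ECCE speedup below 1 in Table 3.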

All in all, the sonic approach provides extremely fast unfolding combined 
with very good specialisation capabilities. Observe that the sonic approach even 
passes the KMP test, and it is thus the first offline approach to our knowledge 
which does so.^ If it were possible to extend the sonic approach to the global 
control as well, one would hopefully obtain an extremely efficient specialiser 
producing highly optimised residual code. 

Table 1. Specialisation times (total w/o post-processing)

Benchmark       sonic + ECCE    ECCE
advisor              17 ms      150 ms
applast              83 ms       33 ms
doubleapp            50 ms       34 ms
map.reduce           33 ms       50 ms
map.rev              50 ms       67 ms
match.kmp           300 ms      166 ms
matchapp             66 ms       83 ms
maxlength           184 ms      200 ms
regexp.r1            34 ms      400 ms
relative             50 ms      166 ms
remove              367 ms      400 ms
remove2            1049 ms      216 ms
reverse              50 ms       50 ms
rev_acc_type        316 ms       83 ms
rotateprune          67 ms      183 ms
ssupply              34 ms      100 ms
transpose            50 ms      467 ms
upto.sum1            33 ms      284 ms
upto.sum2            50 ms       83 ms


^ A more aggressive unfolding rule, in conjunctive partial deduction, did not lead
to improved speed under compiled code of Prolog by BIM; see [14]. So, this also
depends on the quality of the indexing generated by the compiler.

^ One might argue that the global control is still online. Note, however, that for
KMP no generalisation and thus no global control is actually needed.

Table 2. Specialisation times (unfolding)

Benchmark       sonic + ECCE    ECCE
advisor               0 ms       33 ms
applast               0 ms       16 ms
doubleapp             0 ms        0 ms
map.reduce            0 ms       17 ms
map.rev               0 ms       34 ms
match.kmp             0 ms       99 ms
matchapp              0 ms       33 ms
maxlength             0 ms       67 ms
regexp.r1             0 ms      383 ms
relative              0 ms      166 ms
remove               34 ms      201 ms
remove2              33 ms       50 ms
reverse              16 ms       33 ms
rev_acc_type          0 ms       32 ms
rotateprune           0 ms       99 ms
ssupply               0 ms       67 ms
transpose            16 ms      400 ms
upto.sum1             0 ms      168 ms
upto.sum2             0 ms       66 ms

7 Conclusion

The majority of termination analyses rely on the derivation of level mappings to
prove termination. This paper has described how these level mappings may be
used to obtain precise depth bounds for the control of unfolding during partial
deduction. Thus, a solid link has been established between the fields of static
termination analysis and partial deduction, enabling existing and future
termination analyses to be used to ensure finiteness of the unfolding process.

Furthermore, the paper has described how such depth bounds can be
incorporated in generating extensions. The construction of these forms the foundation
of any offline partial deduction method whether it is based on the self-application 
or the cogen approach. This is the first offline technique which allows arbitrarily 
partially instantiated goals to be sufficiently unfolded to achieve good special- 
isation results. The technique can, surprisingly, yield even better specialisation 
than a pure online technique (and thus the choice of an offline approach does 
not necessarily entail the sacrifice of unfolding potential). This is due to the 
availability of global information in the unfolding decision making process. It is 
also, to our knowledge, the first offline approach which passes the KMP test. 

The framework admits elegant solutions to some problematic unfolding issues 
and these solutions are significantly less complex than their online counterparts. 
Of course, an online technique may still be able to make refined unfolding
decisions based on the availability of concrete data. This strongly suggests that
offline and online methods be combined to achieve maximal unfolding power. 
Another, possibly more challenging, avenue for further research is to extend the 
sonic approach for the global control, so that its advantages in terms of efficiency, 
termination, and specialisation power also apply at the global control level. 
A Further Figures and Tables

← Flatten([[1], [2]], r, 2)
        |
← Append([1], y, r, 1) ∧ Flatten([[2]], y, 1)
        |
← Append([], y, r1, 0) ∧ Flatten([[2]], y, 1)
        |
← Flatten([[2]], y, 1)
        |
← Append([2], y1, y, 1) ∧ Flatten([], y1, 0)
        |
← Append([], y1, r2, 0) ∧ Flatten([], y1, 0)
        |
← Flatten([], y1, 0)
        |
        □

Fig. 1. Unfolding of ← Flatten([[1], [2]], r, 2)

Table 3. Speed of the residual programs (in ms, for a large number of queries,
interpreted code) and Speedups

Benchmark       Original            sonic + ECCE        ECCE                MIXTUS
                time     speedup    time     speedup    time     speedup    time
advisor          1541 ms   1          483 ms    3.19      426 ms    3.62      471 ms
applast          1563 ms   1          491 ms    3.18      471 ms    3.32     1250 ms
doubleapp        1138 ms   1          700 ms    1.63      600 ms    1.90      854 ms
map.reduce        541 ms   1          100 ms    5.41      117 ms    4.62      383 ms
map.rev           221 ms   1           71 ms    3.11       83 ms    2.66      138 ms
match.kmp        4162 ms   1         1812 ms    2.30     3166 ms    1.31     2521 ms
matchapp         1804 ms   1          771 ms    2.34     1525 ms    1.18     1375 ms
maxlength         217 ms   1          283 ms    0.77      208 ms    1.04      213 ms
regexp.r1        3067 ms   1          396 ms    7.74      604 ms    5.08       na
relative         9067 ms   1           17 ms  533.35     1487 ms    6.10       17 ms
remove           3650 ms   1         4466 ms    0.82     2783 ms    1.31     2916 ms
remove2          5792 ms   1         4225 ms    1.37     3771 ms    1.54     3017 ms
reverse          8534 ms   1         6317 ms    1.35     6900 ms    1.24       na
rev_acc_type    37391 ms   1        26302 ms    1.42    26815 ms    1.39    25671 ms
rotateprune      7350 ms   1         5167 ms    1.42     5967 ms    1.23     5967 ms
ssupply          1150 ms   1           79 ms   14.56       92 ms   12.50       92 ms
transpose        1567 ms   1           67 ms               67 ms               67 ms
upto.sum1        6517 ms   1         4284 ms    1.52     4350 ms    1.50     4716 ms
upto.sum2        1479 ms   1         1008 ms    1.47     1008 ms    1.47     1008 ms

Abstract. We extend positive supercompilation to handle negative as
well as positive information. This is done by instrumenting the underly- 
ing unfold rules with a small rewrite system that handles constraints on 
terms, thereby ensuring perfect information propagation. We illustrate 
this by transforming a naively specialised string matcher into an optimal 
one. The presented algorithm is guaranteed to terminate by means of 
generalisation steps. 



1 Introduction 

Turchin’s supercompiler [21] is a program transformer for functional programs 
which performs optimisations beyond partial evaluation [9] and deforestation [23].

Positive supercompilation [7] is a variant of Turchin’s supercompiler which 
was introduced in an attempt to study and explain the essentials of Turchin’s 
supercompiler, how it achieves its effects, and its relation to other transform- 
ers. In particular, the language of the programs to be transformed by positive 
supercompilation is a typical first-order functional language — the one usually 
studied in deforestation — which is rather different from the language Refal, 
usually adopted in connection with Turchin’s supercompiler. 

For the sake of simplicity, the positive supercompiler was designed to maintain
positive information only; that is, when the transformer reaches a conditional
if x==x' then t else t', the information that x = x' is assumed to hold is taken
into account when transforming t (by performing the substitution {x := x'} on t).
In contrast, the negative information that x ≠ x' must hold is discarded when
transforming t' (since no substitution can represent this information!). In
Turchin’s supercompiler this negative information is maintained as a constraint
when transforming t'. Consequently, Turchin’s supercompiler can perform some
optimisations beyond positive supercompilation.

In this paper we present an algorithm which we call perfect supercompila- 
tion — a term essentially adopted from [6] — which is similar to Turchin’s 
supercompiler. The perfect supercompiler arises by extending the positive su- 
percompiler to take negative information into account. Thus, we retain the typ- 
ical first-order language as the language of programs to be transformed, and we 
adopt the style of presentation from positive supercompilation. 

A main contribution of the extension is to develop techniques which mani- 
pulate constraints of a rather general form. Although running implementations 
of Turchin’s supercompiler use such techniques to some extent, the techniques 
have not been presented in the literature for Turchin’s supercompiler as far as we 
know. The only exception is the paper by Glück and Klimov [6] which, however, 
handles constraints of a simpler form; for instance, our algorithm for normalising 
constraints has no counterpart in their technique. As another main contribution 
we generalise a technique for ensuring that positive supercompilation always 
terminates to the perfect supercompiler and prove that, indeed, perfect 
supercompilation terminates on all programs¹. As far as we are aware, no version of 
Turchin’s supercompiler maintaining negative information has been presented 
which in general is guaranteed to terminate. 

The remainder of this paper is organised as follows. We first (Sect. 2) present 
a classical application of positive supercompilation (of transformers in general): 
the generation of an efficient specialised string pattern matcher from a general 
matcher and a known pattern. As is well-known, positive supercompilation gen- 
erates specialised matchers containing redundant tests. We also show how these 
redundant tests are eliminated when one uses instead perfect supercompilation. 
We then (Sect. 3) present an overview of perfect supercompilation and (Sect. 4) 
an overview of the proof that perfect supercompilation always terminates. In 
Sect. 5 we conclude and compare to related work. 



2 The Knuth-Morris-Pratt Example 



In this paper we will only consider programs written in a first-order, func- 
tional language with pattern matching and conditionals. For simplicity, pattern- 
matching functions are allowed to match with non-nested patterns on one pa- 
rameter only, and conditionals can only be used to test the equality of two values 
by means of the == operator. We will use the convention that function names 
are written slanted, variables are written in italics, and constructors are written 
in SMALL CAPS. We also use standard shorthand notation [] and h : t for the 
empty list and the list constructed from h and the tail t, respectively; we further 
use the usual notation [h_1, ..., h_n]. 

Consider the following general matcher program which takes a pattern and 
a string as input and returns TRUE iff the pattern occurs as a substring in the 
string. 



match(p, s)              = m(p, s, p, s)
m([], ss, op, os)        = TRUE
m(p : pp, ss, op, os)    = x(p, pp, ss, op, os)
x(p, pp, [], op, os)     = FALSE
x(p, pp, s : ss, op, os) = if p==s then m(pp, ss, op, os) else n(op, os)
n(op, s : ss)            = m(op, ss, op, ss) .



¹ When termination is guaranteed we cannot guarantee perfect transformation, but 
the underlying unfolding scheme is still perfect in the sense that all information is 
propagated. 
Although this example only compares variables to variables, our method can 
manipulate more general equalities and inequalities. 
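
The general matcher above can be transcribed almost line by line into an executable form. The following is only a sketch: it assumes that lists are modelled as Python sequences (strings or tuples), so that h : t corresponds to splitting off the first element, and the Python function names simply mirror the names in the program.

```python
# A sketch of the general matcher; sequences stand in for lists.

def match(p, s):
    # match(p, s) = m(p, s, p, s)
    return m(p, s, p, s)

def m(pat, ss, op, os):
    if not pat:                       # m([], ss, op, os) = TRUE
        return True
    return x(pat[0], pat[1:], ss, op, os)

def x(p, pp, string, op, os):
    if not string:                    # x(p, pp, [], op, os) = FALSE
        return False
    s, ss = string[0], string[1:]
    return m(pp, ss, op, os) if p == s else n(op, os)

def n(op, os):
    # n(op, s : ss) = m(op, ss, op, ss): restart on the tail of the string.
    # Only called with a non-empty os, exactly as in the original program.
    return m(op, os[1:], op, os[1:])
```

For instance, match("AAB", "AAAB") holds while match("AAB", "ABAB") does not.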

Now consider the following naively specialised matcher match_AAB which matches 
the fixed pattern [A, A, B] with a string u by calling match: 

match_AAB(u) = match([A, A, B], u) . 

Evaluation proceeds by comparing A to the first component of u, A to the second, 
and B to the third. If at some point the comparison fails, the process is restarted 
with the tail of u. 

This strategy is not optimal. Suppose that after matching the two occurrences 
of A in the pattern with the first two occurrences of A in the string, the B in the 
pattern fails to match yet another A in the string. Then the process is restarted 
with the string’s tail, even though it is known that the first two comparisons will 
succeed. Rather than performing these tests whose outcome is known, we should 
skip the first three occurrences of A in the original string and proceed directly 
to compare the B in the pattern with the fourth element of the original string. 
This is done in the KMP specialised matcher: 



match_AAB(u)   = m_AAB(u)
m_AAB([])      = FALSE
m_AAB(s : ss)  = if A==s then m_AB(ss) else m_AAB(ss)
m_AB([])       = FALSE
m_AB(s : ss)   = if A==s then m_B(ss) else m_AAB(ss)
m_B([])        = FALSE
m_B(s : ss)    = if B==s then TRUE
                 else if A==s then m_B(ss) else m_AAB(ss) .

After finding two As and a third symbol which is not a B in the string, this 
program checks (in m_B) whether the third symbol of the string is an A. If so, it 
continues immediately by comparing the next symbol of the string with the B in 
the pattern (by calling m_B), thereby avoiding repeated comparisons. 
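
The KMP specialised matcher can be written down directly as three mutually recursive functions. This is a sketch under the assumption that the input is a Python string and that the constructors A and B are the characters 'A' and 'B'; the helper split is hypothetical and only separates the head from the tail.

```python
# A sketch of the KMP specialised matcher for the pattern AAB.

def split(u):
    # Hypothetical helper: head and tail of a non-empty string.
    return u[0], u[1:]

def m_AAB(u):
    if not u:
        return False
    s, ss = split(u)
    return m_AB(ss) if s == 'A' else m_AAB(ss)

def m_AB(u):
    if not u:
        return False
    s, ss = split(u)
    return m_B(ss) if s == 'A' else m_AAB(ss)

def m_B(u):
    if not u:
        return False
    s, ss = split(u)
    if s == 'B':
        return True
    # Two As have been seen and s is not B: if s is A, the last two
    # symbols are still AA, so stay in m_B instead of restarting.
    return m_B(ss) if s == 'A' else m_AAB(ss)

def match_AAB(u):
    return m_AAB(u)
```

Each symbol of the input is inspected by at most two tests, and no symbol is ever revisited.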

Can we get this program by application of positive supercompilation to the 
naively specialised matcher? The result of this application is depicted graphi- 
cally as a process tree in Fig. 1 (it will be explained later why some nodes are 
emphasised). The root of the process tree is labelled by the initial term that 
is to be transformed (here, the naively specialised matcher). The children of a 
node α in the process tree represent possible unfoldings for the term in α; the 
edges are labelled with the assumptions made about free variables (e.g. u = []). 
Each arc in the process tree can therefore be seen as one step of transformation. 
At the same time the whole tree can be viewed as a new program, where arcs 
with labels represent tests on the input, and the leaves represent final results or 
recursive calls. 

Informally, a program can be extracted from a process tree by creating a 
new function definition for each node α that has labelled outgoing edges; the 
new function has as parameters the set of variables in α and a right-hand side 
Fig. 1. Driving the naively specialised matcher. The children of a node α 
represent possible unfoldings for the term in α; the edges are labelled with the 
assumptions made 






On Perfect Supercompilation 117 

created from the children of α. In fact, the program corresponding to the tree 
in Fig. 1 is the following: 

m_AAB([])      = FALSE
m_AAB(s : ss)  = if A==s then m_AB(ss) else n_AAB(ss, s)
m_AB([])       = FALSE
m_AB(s : ss)   = if A==s then m_B(ss) else n_AB(ss, s)
m_B([])        = FALSE
m_B(s : ss)    = if B==s then TRUE else n_B(ss, s)
n_AAB(ss, s)   = m_AAB(ss)
n_AB(ss, s)    = if A==s then m_AB(ss) else n_AAB(ss, s)
n_B(ss, s)     = if A==s then m_B(ss) else n_AB(ss, s) .

The term m_AAB(u) in this program is more efficient than match([A, A, B], u) in 
the original program. In fact, this is the desired KMP specialised matcher, 
except for the redundant test A==s in n_AB (and the redundant argument in 
n_AAB). The reason for the redundant test A==s is that positive supercompilation 
ignores negative information: when proceeding to the false branch of the 
conditional (from the original program) 

if A==s then m([B], ss, [A, A, B], A : s : ss) else n([A, A, B], A : s : ss) , (*) 

the information that A ≠ s holds is forgotten. Therefore, the test is repeated 
in the subsequent conditional 

if A==s then m([A, B], ss, [A, A, B], s : ss) else n([A, A, B], s : ss) . (+) 

In contrast, with perfect supercompilation, this information is maintained as a 
constraint, and can be used to decide that the conditional (+) has only one 
possible outcome. The tree would therefore continue below the node (+), and the 
resulting program would skip the superfluous test and have a recursive call 
back to the first branching node, x(A, [A, B], u, [A, A, B], u); this is exactly 
the KMP specialised matcher. 
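
The effect of the redundant test can be made visible by counting equality tests. The sketch below runs the program extracted from Fig. 1 with a counter; the flag keep_redundant_test is a hypothetical switch that either keeps the test A==s in n_AB (as positive supercompilation does) or drops it (as the negative information A ≠ s permits). Strings and the characters 'A' and 'B' stand in for lists and constructors.

```python
# A sketch that counts the == tests performed by the residual program.

def run(u, keep_redundant_test):
    tests = [0]

    def eq(a, b):
        tests[0] += 1
        return a == b

    def m_AAB(u):
        if not u:
            return False
        s, ss = u[0], u[1:]
        return m_AB(ss) if eq('A', s) else n_AAB(ss, s)

    def m_AB(u):
        if not u:
            return False
        s, ss = u[0], u[1:]
        return m_B(ss) if eq('A', s) else n_AB(ss, s)

    def m_B(u):
        if not u:
            return False
        s, ss = u[0], u[1:]
        return True if eq('B', s) else n_B(ss, s)

    def n_AAB(ss, s):
        return m_AAB(ss)

    def n_AB(ss, s):
        # Every caller has already established that A != s, so this test
        # can never succeed; it is the test that perfect supercompilation
        # removes.
        if keep_redundant_test and eq('A', s):
            return m_AB(ss)
        return n_AAB(ss, s)

    def n_B(ss, s):
        return m_B(ss) if eq('A', s) else n_AB(ss, s)

    return m_AAB(u), tests[0]
```

On the input "ACAAB" both variants answer TRUE, using six tests with the redundant test and five without it.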

3 Overview of Perfect Supercompilation 

In this and the next section we will give an informal account of the supercompi- 
lation algorithm. For proofs, examples and in-depth treatment of the algorithm, 
see [15]. 

Despite the intended informality we will need the following definitions. Let 
x, y, z range over variables from the set X. Let c, f and g range over fixed arity 
constructor, function and pattern-matching-function names in the finite sets C, 
F and G, respectively. Let p range over patterns of the form c(x_1, ..., x_n) and 
let t, u, s range over terms. By var(t) we denote the variables in t, and we let 
θ range over substitutions, written {x_1 := t_1, ..., x_n := t_n}; application of 
substitutions is defined as usual and written prefix. Finally, we write 
f(...) = t to denote that function f is defined in the program under consideration. 
Perfect supercompilation of a program is carried out in two phases. First, 
a model of the subject program is constructed in the form of a constrained process 
tree. Second, a new program is extracted from the constrained process tree. A 
constrained process tree is similar to the tree in Fig. 1, but each node is labelled 
by a term and a set of constraints. From here on we will not distinguish between 
constrained and unconstrained process trees, and we will refer to some part of 
the label of a node α simply by saying “node α contains ...”. 

The root of the process tree is labelled by the initial term that is to be trans- 
formed, together with an empty constraint system. The process tree is developed 
by repeated unfoldings of the terms in the leaves. The rules that govern the un- 
folding of terms have been constructed by extending the small-step semantics of 
the language by rules that speculatively execute tests that depend on variables. 
For each possible outcome of a test, a child is added and information about the 
test that has been conducted is appended to the current constraint system. The 
extended constraint system is then passed on to the child that resulted from 
the speculative execution. Our constraint systems (a subset of the ones defined 
in [2]) are restricted kinds of conjunctive normal forms of formulae of the form 

a = a'    and    b ≠ b'

where a, b are terms that consist of variables and constructors only, i.e. 

a, b ::= x | c(a_1, ..., a_n) . 

The constraint systems are used to prune branches from the process tree: 
speculative execution of a test that results in a constraint system that cannot 
be satisfied will not produce a new child. For instance, consider again the 
conditional (+); blindly unfolding this term would result in a node with two 
children: 



if A==s then m([A, B], ss, [A, A, B], s : ss) else n([A, A, B], s : ss) 




But since we have inherited the constraint system A ≠ s from the conditional (*), 
the left child will not be produced because the resulting constraint system 
A ≠ s ∧ A = s is not satisfiable. More precisely, for a constraint system to 
be satisfiable, it must be possible to assign values to the variables in the system 
such that the constraints are satisfied. A constraint system is thus satisfiable 
iff there exists a substitution θ such that, for each equation a = a', θa will be 
syntactically identical to θa', and likewise, for each disequation b ≠ b', θb will 
be syntactically different from θb'. To decide the satisfiability of a constraint 
system R, we first apply a set of rewrite rules to bring R into a normal form. 
The core of these rewrite rules (a modified version of the ones presented in [2]) is 
shown in Fig. 2. Additional control on these rules ensures that non-deterministic, 
Fig. 2. Rewrite system for normalisation of constraint systems. ⊥ represents an 
unsatisfiable element, ⊤ represents the trivially satisfiable element, and • stands 
for an arbitrary part of a formula 



exhaustive application of the rewrite rules to any constraint system terminates 
and results in a constraint system in normal form. A constraint system in normal 
form is either ⊥ (false), ⊤ (true), or of the form 

\[ \bigwedge_{i=1}^{n} x_i = a_i \;\wedge\; \bigwedge_{j=1}^{m} \Bigl( \bigvee_{k=1}^{l} a_{j,k} \neq b_{j,k} \Bigr) \]

When no type information about the variables in a constraint system is 
present, a constraint system in normal form is satisfiable exactly when it is 
different from ⊥. However, when it is known that a variable x can assume a 
finite set of values only, it is necessary to verify that there indeed exists a value 
which, when assigned to x, satisfies the constraint system. For instance, consider 
the constraint system 

x ≠ y ∧ y ≠ z ∧ z ≠ x 

where all variables have boolean type. This system is in normal form and 
therefore appears to be satisfiable, but it is not possible to assign the values FALSE 
or TRUE to the variables such that the system is satisfied. When a constraint 
system R is in normal form, it is thus necessary to systematically try out all 
possible combinations of value assignments for variables with known, 
finitely-valued types. This is done by instantiating such variables in R, which possibly 
will call for further rewrite steps to take R into a normal form, and so on until 
there are no more finitely-valued variables left. If R now is different from ⊥, R 
is satisfiable [15]. 
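
The satisfiability check can be illustrated with a small sketch. It does not implement the rewrite system of Fig. 2; instead it swaps in ordinary first-order unification for the equations and then inspects each disjunction of disequations. The term encoding is an assumption made for this example only: a variable is a Python string, and a constructor term is a tuple whose first component is the constructor name. The function satisfiable assumes every variable ranges over infinitely many values; satisfiable_finite is a brute-force check for variables with known finite domains.

```python
from itertools import product

def walk(t, sub):
    # Follow variable bindings until an unbound variable or a constructor term.
    while isinstance(t, str) and t in sub:
        t = sub[t]
    return t

def resolve(t, sub):
    # Apply the substitution to the whole term.
    t = walk(t, sub)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(a, sub) for a in t[1:])

def occurs(v, t, sub):
    t = walk(t, sub)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, a, sub) for a in t[1:])

def unify(a, b, sub):
    # Return an extended substitution, or None if a = b has no solution.
    a, b = walk(a, sub), walk(b, sub)
    if a == b:
        return sub
    if isinstance(a, str):
        return None if occurs(a, b, sub) else {**sub, a: b}
    if isinstance(b, str):
        return unify(b, a, sub)
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        sub = unify(x, y, sub)
        if sub is None:
            return None
    return sub

def holds(clauses, sub):
    # A clause is a disjunction of disequations (pairs of terms); it fails
    # only if every pair is syntactically identical under the substitution.
    return all(any(resolve(a, sub) != resolve(b, sub) for a, b in clause)
               for clause in clauses)

def satisfiable(equations, clauses):
    sub = {}
    for a, b in equations:
        sub = unify(a, b, sub)
        if sub is None:
            return False
    return holds(clauses, sub)

def satisfiable_finite(clauses, domains):
    # Try every assignment of the finitely-valued variables.
    names = sorted(domains)
    return any(holds(clauses, dict(zip(names, values)))
               for values in product(*(domains[v] for v in names)))
```

With this encoding, the system A ≠ s ∧ A = s is rejected, and the boolean example x ≠ y ∧ y ≠ z ∧ z ≠ x is accepted by satisfiable but rejected by satisfiable_finite once all three variables are restricted to TRUE and FALSE.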
We can now show how constraint systems can be used to guide the construction 
of the process tree. Every term t in the process tree is associated with 
a constraint system R, denoted ⟨t, R⟩. The complete set of unfold rules is 
presented in Fig. 3. Rules (A)-(B), (D), (E)-(F)² and (G)-(J) correspond to normal 
evaluation with respect to the semantics of the language³. Rules (C) and (E)-(F)⁴ 
perform speculative execution of a term based on the information in the 
associated constraint system. 

Rule (C) instantiates a free variable y to the pattern c(y_1, ..., y_m) taken from 
the function definition (using fresh variables). This is achieved by appending 
the equation y = c(y_1, ..., y_m) to the current constraint system. If the new 
constraint system is satisfiable, the function application can be unfolded. In the 
same manner, rules (E) and (F) handle conditional expressions where general 
equations and disequations can be appended to the constraint system. Rule (K) 
finally separates the resulting constraint system R into positive and negative 
information by normalising R: the positive information (of the form ⋀ x = a) can 
be regarded as a substitution, which can then be separated from the normalised 
R; we denote this separation by R' = (θ ∧ R''). The positive information can 
then be propagated to the context by applying the substitution θ to the whole 
term⁵. 

Unfolding of a branch is stopped if the leaf in that branch is a value or if 
an ancestor node covers (explained below) all possible executions that can arise 
from the leaf. The latter case constitutes a fold operation which will eventually 
result in a recursive call in the derived program. 

We say that a node covers another node if the terms of the two nodes are equal 
up to renaming of variables and the constraint system in the leaf is at least as 
restrictive as the one in its ancestor. Intuitively speaking, if these two conditions 
are met, any real computation performed by the leaf can also be performed by 
the ancestor; we can then safely produce a recursive call in the derived program. 
In [15] an algorithm is presented that gives a safe approximation to the question 
“is R more restrictive than R'?”. 

If we look at the process tree in Fig. 1, we will see that some parts of the tree 
are created by deterministic unfolding, i.e. they each consist of a single path. This 
is a good sign, since it means that the path represents local computations that 
will always be carried out when the program is in this particular state, regardless 
of the uninstantiated variables. We have thus precomputed these intermediate 
transitions once and for all — as done in partial evaluation — and we can omit 
the intermediate steps and simply remember the result. 



² Without free variables in a and a'. 

³ The intended semantics of our language is evaluation to weak head normal form, 
except for comparison in conditionals where the terms to be compared are fully 
evaluated before the comparison is carried out. For simplicity, the unfolding rules 
are call-by-name which, unfortunately, can give rise to duplication of computation. 

⁴ With free variables in a or a'. 

⁵ The positive information is re-injected into the context because such re-injection 
neatly disposes of irrelevant information about variables that are no longer present. 
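                     f(x_1, ..., x_n) = t
(A)  ----------------------------------------------------------------
     ⟨f(t_1, ..., t_n), R⟩ ↦ ⟨{x_1 := t_1, ..., x_n := t_n}t, R⟩

             g(c(x_1, ..., x_m), x_m+1, ..., x_n) = t
(B)  -----------------------------------------------------------------------------------
     ⟨g(c(t_1, ..., t_m), t_m+1, ..., t_n), R⟩ ↦ ⟨{x_1 := t_1, ..., x_n := t_n}t, R⟩

     g(p, x_1, ..., x_n) = t     R' = R ∧ [x = p]     satisfiable(R')
(C)  -------------------------------------------------------------------
     ⟨g(x, t_1, ..., t_n), R⟩ ↦ ⟨{x_1 := t_1, ..., x_n := t_n}t, R'⟩

     g(p, x_1, ..., x_n) = t     ⟨t, R⟩ ↦ ⟨t', R'⟩
(D)  -----------------------------------------------------------
     ⟨g(t, t_1, ..., t_n), R⟩ ↦ ⟨g(t', t_1, ..., t_n), R'⟩

     R' = R ∧ [a = a']     satisfiable(R')
(E)  ---------------------------------------------
     ⟨if a==a' then t else t', R⟩ ↦ ⟨t, R'⟩

     R' = R ∧ [a ≠ a']     satisfiable(R')
(F)  ---------------------------------------------
     ⟨if a==a' then t else t', R⟩ ↦ ⟨t', R'⟩

                      ⟨t_1, R⟩ ↦ ⟨t_1', R'⟩
(G)  ------------------------------------------------------------------------------
     ⟨if t_1==t_2 then t_3 else t_4, R⟩ ↦ ⟨if t_1'==t_2 then t_3 else t_4, R'⟩

                      ⟨t_2, R⟩ ↦ ⟨t_2', R'⟩
(H)  ------------------------------------------------------------------------
     ⟨if a==t_2 then t_3 else t_4, R⟩ ↦ ⟨if a==t_2' then t_3 else t_4, R'⟩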



{t, R) {t' , R') 

{t, R) , R') 



{t,R) \^{t',R') 



{t,R)p^{t',R') 




Fig. 3 . Unfold rules with perfect information propagation 
Creation of a process tree in the manner just described does not always 
terminate since infinite process trees can be produced. To keep the process trees 
finite, we ensure that no infinite branches are produced. It turns out that in 
every infinite branch, there must be a term that homeomorphically embeds an 
ancestor (this is known as Kruskal’s Tree Theorem). The homeomorphic 
embedding relation ⊴ is the smallest relation on terms such that, for any symbol 
h ∈ C ∪ F ∪ G ∪ {ifthenelse}, 

            ∃i ∈ {1, ..., n} : t ⊴ t'_i         ∀i ∈ {1, ..., n} : t_i ⊴ t'_i
 x ⊴ y     ------------------------------     ----------------------------------------
             t ⊴ h(t'_1, ..., t'_n)            h(t_1, ..., t_n) ⊴ h(t'_1, ..., t'_n)

When a term t' in a leaf homeomorphically embeds a term t in an ancestor, 
there is thus a danger of producing an infinite branch. In such a situation, t or 
t' is split up by means of a generalisation step. 
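
The three rules defining the embedding relation translate into a short recursive check. In this sketch the term encoding is an assumption: a variable is a Python string, and any other term is a tuple whose first component is the symbol h and whose remaining components are the arguments. The function name embeds is hypothetical.

```python
def embeds(t, u):
    """Return True if t is homeomorphically embedded in u (t ⊴ u)."""
    t_is_var = isinstance(t, str)
    u_is_var = isinstance(u, str)
    if u_is_var:
        # Only the variable rule x ⊴ y can apply to a variable on the right.
        return t_is_var
    # Diving: t is embedded in some argument of u.
    if any(embeds(t, arg) for arg in u[1:]):
        return True
    # Coupling: same symbol, same arity, arguments embedded pairwise.
    return (not t_is_var
            and t[0] == u[0]
            and len(t) == len(u)
            and all(embeds(a, b) for a, b in zip(t[1:], u[1:])))
```

For example, f(x) is embedded in f(g(y)), whereas f(g(x)) is not embedded in f(y).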

Definition 1 (Generalisation). 

1. A term u is an instance of term t, denoted u > t, if there exists a substitution 
θ such that θt = u. 

2. A generalisation of two terms t, u is a term s such that t > s and u > s. 

3. A most specific generalisation (msg) of two terms t, u is a generalisation s 
such that, for all generalisations s' of t, u, s > s'. (There exists exactly one 
msg of t, u modulo renaming.) □ 
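
A most specific generalisation can be computed by descending through both terms in parallel and introducing a variable wherever they disagree. The sketch below assumes the same hypothetical term encoding as before (variables are strings, other terms are tuples headed by their symbol) and assumes that generated names of the form _v0, _v1, ... do not clash with variables already present in the terms.

```python
def msg(t, u):
    """Return (s, theta_t, theta_u) such that theta_t s = t and theta_u s = u."""
    table = {}   # pair of disagreeing subterms -> generated variable

    def go(a, b):
        same_symbol = (not isinstance(a, str) and not isinstance(b, str)
                       and a[0] == b[0] and len(a) == len(b))
        if same_symbol:
            return (a[0],) + tuple(go(x, y) for x, y in zip(a[1:], b[1:]))
        # Reusing the variable for a repeated pair is what makes the
        # generalisation most specific.
        if (a, b) not in table:
            table[(a, b)] = "_v%d" % len(table)
        return table[(a, b)]

    s = go(t, u)
    theta_t = {v: a for (a, _), v in table.items()}
    theta_u = {v: b for (_, b), v in table.items()}
    return s, theta_t, theta_u

def apply(theta, s):
    # Apply a substitution (a dict from variable names to terms) to a term.
    if isinstance(s, str):
        return theta.get(s, s)
    return (s[0],) + tuple(apply(theta, a) for a in s[1:])
```

For instance, the msg of f(A, A) and f(B, B) is f(v, v) for a single variable v, not the less specific f(v, w).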

A generalisation step on a process tree calculates the msg of the terms t, t' in 
two nodes α, α'; the msg is then used to divide one of the nodes into subterms 
that can be unfolded independently: 




where s is the msg of t and t', and t = s{x_1 := u_1, ..., x_n := u_n}. Which of 
the nodes t, t' is split up depends on how similar the nodes are (this will be made 
more precise below). The introduction of let-terms is merely for notational 
convenience; they will be unrolled in the derived program. 

We can now sketch the full supercompilation algorithm. To ensure termination 
and, at the same time, provide reasonable specialisation, we partition the 
nodes in the process tree into three categories (along the lines of [18]): 

1. nodes containing let-terms, 

2. global nodes, and 

3. local nodes. 
Global nodes are those that represent speculative execution or final results (both 
of which must be present in the derived program). Local nodes are those nodes 
that are not global and do not contain let-terms. For example, in Fig. 1 the 
local nodes are indicated by dotted frames (there are no nodes containing 
let-terms since there is no need for generalisation in that particular example). 
This partitioning of the nodes is used to control the unfolding. 

Definition 2 (Relevant ancestor). Let T be a process tree and let the set of 
relevant ancestors relanc(T, α) of a node α in T be defined thus: 

                 { ∅,                               if α contains a let-term 
  relanc(T, α) = { all ancestors that are global,   if α is global 
                 { all local ancestors,             if α is local 

where the local ancestors of α are all ancestors that are local up to the first 
common ancestor that is global. □ 

For an example, consider the process tree in Fig. 1; the local node x(A, [B], 
s : ss, [A, A, B], A : s : ss) near the bottom has as local ancestors all ancestors up to 
and including the node n([A, A, B], A : A : s : ss). 

Definition 3 (Drive). Let T be a process tree and α a node in T. Then 

1. T(α) denotes the label of node α. 

2. T{α := T'} denotes a new tree that is identical to T except that the subtree 
rooted at α has been replaced by T'. 

3. ε denotes the root node of a tree. 

4. If {⟨t_1, R_1⟩, ..., ⟨t_n, R_n⟩} = {⟨t, R⟩ | T(α) ↦ ⟨t, R⟩}, then 

                                     T(α)
   drive(T, α) = T{α :=         /    ...    \          }
                        ⟨t_1, R_1⟩   ...   ⟨t_n, R_n⟩

□ 



Definition 4 (Finished). A leaf α in a process tree T is finished if one of the 
following conditions is satisfied: 

1. T(α) = ⟨c(), ...⟩ for some constructor c. 

2. T(α) = ⟨x, ...⟩ for some variable x. 

3. There is an ancestor α' of α such that (a) α, α' are global nodes, and (b) 
if T(α) = ⟨t, R⟩ and T(α') = ⟨t', R'⟩ then t is a renaming of t' and R is at 
least as restrictive as R'. 

A tree T is said to be finished when all leaves are finished. □ 
With these definitions, we can sketch the supercompilation algorithm thus: 

input a term t 
let T consist of a single node labelled ⟨t, ⊤⟩ 
while T is not finished begin 
    let α = ⟨t, R⟩ be an unfinished leaf in T 
    if ∀α' = ⟨t', R'⟩ ∈ relanc(T, α) : t' ⋬ t then T = drive(T, α) 
    else begin 
        let α' = ⟨t', R'⟩ ∈ relanc(T, α) such that t' ⊴ t 
        if R' is more restrictive than R then T = T{α' := ⟨t', ⊤⟩} 
        else if t > t' then T = T{α := ⟨generalise(t, t'), R⟩} 
        else T = T{α' := ⟨generalise(t', t), R'⟩} 
    end 
end 
output T 

The transformed program can be extracted from the process tree by examination 
of the global nodes (collecting the set of free variables) and the labels on the 
edges. 

4 Overview of the Termination Proof 

A language-independent framework for proving termination of abstract program 
transformers has been presented in [17], where sufficient conditions have been 
established for abstract program transformers to terminate. In this section we 
will give a very rough sketch of how this framework has been used to prove 
termination of our algorithm. 

An abstract program transformer is a map from trees to trees, such that a 
single step of transformation is carried out by each application of the transformer. 
Termination then amounts to a certain form of convergence of the sequences of 
trees obtained by repeatedly applying the transformer. 

For a transformer to fit the framework, it is sufficient to ensure that 

1. the transformer converges, in the sense that for each transformation step, 
the difference between two consecutive trees lessens. More precisely, in the 
sequence of trees produced by the transformation, for any depth d there must 
be some point from which every two consecutive trees are identical down to 
depth d; and 

2. the transformer maintains some invariant such that only finite trees are 
produced. 

By induction on the depth of the trees produced, the former can be proved by 
the fact that the algorithm either 

1. adds new leaves to a tree which trivially makes consecutive trees identical 
at an increasing depth, or 
2. generalises a node, i.e. replaces a subtree by a node containing a let-term (or 
a node containing the empty constraint system ⊤). Since generalisation only 
occurs on terms which are not let-terms (or contain non-empty constraint 
systems, respectively), a node can be generalised at most twice. 

The latter is ensured because, in every process tree, 

1. a path that consists of let-terms only must be finite, since each let-term t 
will have proper subterms of t as children; thus the size of the nodes in such a 
path strictly decreases. 

2. all other nodes are not allowed to homeomorphically embed an ancestor 
(except for finished nodes, but these are all leaves). 

5 Conclusion and Related Work 

We have presented an algorithm for a supercompiler for a first-order functional 
language that maintains positive as well as negative information. The algorithm 
is guaranteed to terminate on all programs, and it is strong enough to pass the 
so-called KMP-test. 

In [22], Turchin briefly describes how the latest version of his supercompiler 
utilises contraction and restriction patterns in driving Refal graphs, the under- 
lying representation of Refal programs. It seems that the resolution of clashes 
between assignments and contractions/restrictions can achieve propagation of 
negative information that — to some extent — provides the power equivalent to 
what has been presented in the present paper, but the exact relationship is at 
present unclear to us. 

In the field of partial evaluation, Consel and Danvy [3] have described how 
negative information can be incorporated into a naively specialised matcher, 
thereby achieving effects similar to those described in the present paper. This, 
however, is achieved by a non-trivial rewrite of the subject program before partial 
evaluation is applied, thus rendering full automation difficult. 

In the case of Generalised Partial Computation [5], Takano has presented a 
transformation technique [19] that exceeds the power of both Turchin’s super- 
compiler and perfect supercompilation. This extra power, however, stems from 
an unspecified theorem prover that needs to be fed the properties about prim- 
itive functions in the language, axioms for the data structures employed in the 
program under consideration, etc. In [20] the theorem prover is replaced by a 
congruence closure algorithm [14], which allows for the automatic generation of a 
KMP-matcher from a naively specialised algorithm when some properties about 
list structures are provided. In comparison to supercompilation, Generalised 
Partial Computation as formulated by Takano has no concept of generalisation and 
will therefore terminate only for a small class of programs. 

When one abandons simple functional languages (as treated in the present 
paper) and considers logic programming and constraint logic programming, sev- 
eral accounts exist of equivalent transformation power, e.g. [16, 8, 10, 11]. In 
these frameworks, search and/or constraint solving facilities of the logic 
language provide the necessary machinery to avoid redundant computations. In 
this field, great efforts have been made to produce optimal specialisation, and 
at the same time to ensure termination, see e.g. [12, 13]. 

Acknowledgements. Thanks to Robert Glück, Neil D. Jones, Laura Lafave and 
Michael Leuschel for discussions and comments. Thanks to Peter Sestoft for 
many insightful comments to [15]. 

References 

[1] ACM. Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and 
Semantics-Based Program Manipulation, volume 26(9) of ACM SIGPLAN Notices, 
New York, September 1991. ACM Press. 

[2] Hubert Comon and Pierre Lescanne. Equational problems and disunification. 
Journal of Symbolic Computation, 7(3-4):371-425, March-April 1989. 

[3] Charles Consel and Olivier Danvy. Partial evaluation of pattern matching in 
strings. Information Processing Letters, 30(2):79-86, 1989. 

[4] O. Danvy, R. Glück, and P. Thiemann, editors. Partial Evaluation, volume 1110 
of Lecture Notes in Computer Science. Springer-Verlag, 1996. 

[5] Y. Futamura and K. Nogi. Generalized partial computation. In D. Bjørner, 
A.P. Ershov, and N.D. Jones, editors, Partial Evaluation and Mixed Computation, 
pages 133-151, Amsterdam, 1988. North-Holland. 

[6] R. Glück and A.V. Klimov. Occam’s razor in metacomputation: the notion of a 
perfect process tree. In P. Cousot, M. Falaschi, G. Filé, and G. Rauzy, editors, 
Workshop on Static Analysis, volume 724 of Lecture Notes in Computer Science, 
pages 112-123. Springer-Verlag, 1993. 

[7] R. Glück and M.H. Sørensen. A roadmap to metacomputation by 
supercompilation. In Danvy et al. [4], pages 137-160. 

[8] T.J. Hickey and D. Smith. Toward the partial evaluation of CLP languages. In 
PEPM’91 [1], pages 43-51. 

[9] N.D. Jones, C.K. Gomard, and P. Sestoft. Partial Evaluation and Automatic 
Program Generation. Prentice-Hall, 1993. 

[10] L. Lafave and J.P. Gallagher. Partial evaluation of functional logic programs 
in rewriting-based languages. Technical Report CSTR-97-001, Department of 
Computer Science, University of Bristol, March 1997. 

[11] L. Lafave and J.P. Gallagher. Extending the power of automatic constraint-based 
partial evaluators. ACM Computing Surveys, 30(3es), September 1998. Article 
15. 

[12] Michael Leuschel and Danny De Schreye. Constrained partial deduction and the 
preservation of characteristic trees. New Generation Computing, 1997. 

[13] Michael Leuschel, Bern Martens, and Danny De Schreye. Controlling 
generalization and polyvariance in partial deduction of normal logic programs. ACM 
Transactions on Programming Languages and Systems, 20(1):208-258, January 
1998. 

[14] Greg Nelson and Derek C. Oppen. Fast decision procedures based on congruence 
closure. Journal of the ACM, 27(2):356-364, April 1980. 

[15] J.P. Secher. Perfect supercompilation. Technical Report 99/01, Department of 
Computer Science, University of Copenhagen, 1999. 

[16] D. Smith. Partial evaluation of pattern matching in constraint logic programming. 
In PEPM’91 [1], pages 62-71. 

[17] M.H.B. Sørensen. Convergence of program transformers in the metric space of 
trees. In J. Jeuring, editor, Mathematics of Program Construction, volume 1422 
of Lecture Notes in Computer Science, pages 315-337. Springer-Verlag, 1998. 

[18] M.H. Sørensen and R. Glück. Introduction to supercompilation. In DIKU Summer 
school on Partial Evaluation, Lecture Notes in Computer Science. Springer-Verlag, 
to appear. 

[19] A. Takano. Generalized partial computation for a lazy functional language. In 
PEPM’91 [1], pages 1-11. 

[20] A. Takano. Generalized partial computation using disunification to solve 
constraints. In M. Rusinowitch and J.L. Remy, editors, Conditional Term Rewriting 
Systems. Proceedings, volume 656 of Lecture Notes in Computer Science, pages 
424-428. Springer-Verlag, 1993. 

[21] V.F. Turchin. The concept of a supercompiler. ACM Transactions on 
Programming Languages and Systems, 8(3):292-325, 1986. 

[22] V.F. Turchin. Metacomputation: Metasystem transition plus Supercompilation. 
In Danvy et al. [4], pages 481-510. 

[23] P.L. Wadler. Deforestation: Transforming programs to eliminate intermediate 
trees. Theoretical Computer Science, 73:231-248, 1990. 




Linear Time Self-Interpretation 
of the Pure Lambda Calculus 



Torben Mogensen 

DIKU, University of Copenhagen, Denmark 
Universitetsparken 1, DK-2100 Copenhagen Ø, Denmark 
phone: (+45) 35321404, fax: (+45) 35321401 
torbenm@diku.dk 



Abstract. We show that linear time self-interpretation of the pure 
untyped lambda calculus is possible. The present paper shows this result for 
reduction to weak head normal form under call-by-name, call-by-value 
and call-by-need. 

We use operational semantics to define each reduction strategy. For each 
of these we show a simulation lemma that states that each inference step 
in the evaluation of a term by the operational semantics is simulated by 
a sequence of steps in evaluation of the self-interpreter applied to the 
term. 

By assigning costs to the inference rules in the operational semantics, 
we can compare the cost of normal evaluation and self-interpretation. 
Three different cost-measures are used: number of beta-reductions, cost 
of a substitution-based implementation and cost of an environment-based 
implementation. 

For call-by-need we use a non-deterministic semantics, which simplifies 
the proof considerably. 



1 Program and Data Representation 

In order to talk about self-interpretation of the pure lambda calculus, we must 
consider how to represent programs as data. 

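We will use the representation defined (for closed terms) in [5]: 

$$\lceil M \rceil \equiv \lambda a.\lambda b.\overline{M}$$

where 

$$\begin{aligned}
\overline{x} &\equiv x \\
\overline{P\,Q} &\equiv a\,\overline{P}\,\overline{Q} \\
\overline{\lambda x.P} &\equiv b\,(\lambda x.\overline{P})
\end{aligned}$$

where $M$ has been renamed so that the variables $a$ and $b$ do not occur 
anywhere, and $\equiv$ is alpha-equivalence. We get an exceedingly simple 
self-interpreter: 

$$\mathit{selfint} \equiv \lambda m.m\;I\;I$$

where $I \equiv \lambda x.x$. It is trivial to prove that 
$\mathit{selfint}\;\lceil M \rceil =_\beta M$. 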
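
As an illustration of the representation above, the following minimal Python sketch builds the encoding of a term. The class names `Var`, `Lam`, `App` and the functions `bar`, `encode` and `show` are hypothetical names introduced here and are not taken from the paper; the sketch assumes that the encoded term never uses the variable names `a` and `b`.

```python
from dataclasses import dataclass

# Hypothetical term syntax for the pure lambda calculus (names are illustrative).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

I = Lam("x", Var("x"))
SELFINT = Lam("m", App(App(Var("m"), I), I))   # selfint = λm. m I I

def bar(t):
    # Overline translation: applications are tagged with a, abstractions with b.
    # Assumes t never uses the names a or b.
    if isinstance(t, Var):
        return t
    if isinstance(t, App):
        return App(App(Var("a"), bar(t.fun)), bar(t.arg))
    return App(Var("b"), Lam(t.var, bar(t.body)))

def encode(t):
    # The representation of t: λa.λb.bar(t).
    return Lam("a", Lam("b", bar(t)))

def show(t):
    # Print a term with just enough parentheses to be unambiguous.
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        return f"λ{t.var}.{show(t.body)}"
    fun = show(t.fun)
    if isinstance(t.fun, Lam):
        fun = f"({fun})"
    arg = show(t.arg)
    if not isinstance(t.arg, Var):
        arg = f"({arg})"
    return f"{fun} {arg}"

print(show(encode(App(I, Lam("y", Var("y"))))))
```

Applying `SELFINT` to an encoded term binds both tags to the identity, which is why the tags simply disappear during evaluation.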

2 Linear Time Self-Interpretation Using Call-by-Name Reduction 

Call-by-name evaluation can be described by the inference rules: 

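$$\rho \vdash \lambda x.M \Rightarrow (\lambda x.M,\ \rho) \qquad \text{(LAMBDA)}$$

$$\frac{\rho' \vdash M \Rightarrow W}{\rho \vdash x \Rightarrow W} \quad \text{where } \rho(x) = (M, \rho') \qquad \text{(VAR)}$$

$$\frac{\rho \vdash M \Rightarrow (\lambda x.M',\ \rho') \qquad \rho'[x \mapsto (N, \rho)] \vdash M' \Rightarrow W}{\rho \vdash M\,N \Rightarrow W} \qquad \text{(BETA)}$$
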



We can define various cost measures by assigning costs to uses of the inference 
rules in an evaluation tree. For example, we can count beta reductions by letting 
each use of the (BETA) rule count 1 and not charge anything for the other rules. 
But we can also define more fine-grained (and more realistic) cost measures by 
assigning different costs. 

For lack of space, we omit showing how the inference rules can be used to 
derive the initial stages of self-interpretation of a closed term M . These stages, 
however, define the relation between the environments used in normal evaluation 
and in self-interpretation: 



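$$\begin{aligned}
\overline{[\,]} &= \rho_2 \\
\overline{\rho[x \mapsto (S, \rho')]} &= \overline{\rho}[x \mapsto (\overline{S}, \overline{\rho'})]
\end{aligned}$$

where 

$$\begin{aligned}
\rho_2 &= [a \mapsto (I, \rho_1),\ b \mapsto (I, \rho_1)] \\
\rho_1 &= [m \mapsto (\lceil M \rceil, [\,])]
\end{aligned}$$

The empty environment is denoted $[\,]$. The $M$ referred to in $\rho_1$ is the 
entire term being interpreted. Note that $|\overline{\rho}| = |\rho| + 2$. We 
need a simulation lemma: 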

Lemma 1. If we from the call-by-name inference rules can derive the evaluation 
$\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ then we can also derive the 
evaluation $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$, 

which we prove in figure 1. We use in this (and the following proofs) a notation 
where “···” refers to an unspecified proof tree. This is to indicate where an 
induction step is used: If normal evaluation has a proof tree indicated by “···”, 
we replace this in the simulation by a proof tree that by induction is assumed 
to exist. This proof tree is in the simulation also indicated by “···”. Semantic 
variables (place holders) in the conclusion of a rule where the premise is “···” 
can be considered existentially quantified, like variables in the premise of an 
inference rule typically are. 

By assigning costs to the inference rules, we can count the costs for normal 
evaluation and self-interpretation and hence prove linear-time self-interpretation. 
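We prove lemma 1 by induction over the evaluation tree with $N$ at its root: 

$N = x$: Let $\rho(x) = (S, \rho'')$. 

Normal evaluation: 

$$\dfrac{\dfrac{\cdots}{\rho'' \vdash S \Rightarrow (\lambda y.W, \rho')}}{\rho \vdash x \Rightarrow (\lambda y.W, \rho')}$$

Self-interpretation: 

$$\dfrac{\dfrac{\cdots}{\overline{\rho''} \vdash \overline{S} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}}{\overline{\rho} \vdash x \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}$$

$N = \lambda y.W$: $\rho \vdash N \Rightarrow (\lambda y.W, \rho)$ is a leaf tree. $\overline{N} = b\,(\lambda y.\overline{W})$, so we get 

$$\dfrac{
  \dfrac{\rho_1 \vdash I \Rightarrow (I, \rho_1)}{\overline{\rho} \vdash b \Rightarrow (\lambda z.z, \rho_1)}
  \qquad
  \dfrac{\overline{\rho} \vdash \lambda y.\overline{W} \Rightarrow (\lambda y.\overline{W}, \overline{\rho})}{\rho_1[z \mapsto (\lambda y.\overline{W}, \overline{\rho})] \vdash z \Rightarrow (\lambda y.\overline{W}, \overline{\rho})}
}{\overline{\rho} \vdash b\,(\lambda y.\overline{W}) \Rightarrow (\lambda y.\overline{W}, \overline{\rho})}$$

$N = N_1\,N_2$: The normal evaluation tree is 

$$\dfrac{
  \dfrac{\cdots}{\rho \vdash N_1 \Rightarrow (\lambda v.N_3, \rho'')}
  \qquad
  \dfrac{\cdots}{\rho''[v \mapsto (N_2, \rho)] \vdash N_3 \Rightarrow (\lambda y.W, \rho')}
}{\rho \vdash N_1\,N_2 \Rightarrow (\lambda y.W, \rho')}$$

We have $\overline{N} = a\,\overline{N_1}\,\overline{N_2}$, so we get (by induction) the following tree for self-interpretation 

$$\dfrac{
  \dfrac{
    \dfrac{\rho_1 \vdash I \Rightarrow (I, \rho_1)}{\overline{\rho} \vdash a \Rightarrow (\lambda z.z, \rho_1)}
    \qquad
    \dfrac{\dfrac{\cdots}{\overline{\rho} \vdash \overline{N_1} \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}}{\rho_1[z \mapsto (\overline{N_1}, \overline{\rho})] \vdash z \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}
  }{\overline{\rho} \vdash a\,\overline{N_1} \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}
  \qquad
  \dfrac{\cdots}{\overline{\rho''}[v \mapsto (\overline{N_2}, \overline{\rho})] \vdash \overline{N_3} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}
}{\overline{\rho} \vdash a\,\overline{N_1}\,\overline{N_2} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}$$

Fig. 1. Proof of lemma 1 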



We start by counting beta reductions. For this, we let each use of the (BETA) 
rule count 1 and the other rules count 0. 

The (not shown) tree for the initial stages of self-interpretation uses the beta 
rule three times, so this tree has cost 3. For the remainder of the computations 
we use this lemma: 

Lemma 2. If derivation of $\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ uses $n$ beta-reductions, then 
derivation of $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$ uses $3n + 1$ beta-reductions. 

The proof is done by induction over the structure of the evaluation tree, using 
the proof of lemma 1 as skeleton. 

$N = x$: Neither normal evaluation nor the self-interpretation uses the (BETA) 
rule, so the result follows by induction on the subtree. 

$N = \lambda y.W$: The normal evaluation tree has cost 0 while self-interpretation 
uses the (BETA) rule once. Since $3 \cdot 0 + 1 = 1$, we are done. 

$N = N_1\,N_2$: Assuming the subtrees have costs $k_1$ and $k_2$ respectively, the 
total cost of normal evaluation is $k_1 + k_2 + 1$. By induction, the costs for the 
subtrees for self-interpretation are $3k_1 + 1$ and $3k_2 + 1$ and the tree uses (BETA) 
twice, so the total cost is $3k_1 + 3k_2 + 4 = 3(k_1 + k_2 + 1) + 1$, which is what we 
want. □ 

By adding the cost for the initial stages of self-interpretation, we get: 

Theorem 1. If a closed term $M$ via the call-by-name semantics evaluates to 
a weak head normal form (WHNF) using $n$ beta reductions, then $\mathit{selfint}\;\lceil M \rceil$ 
evaluates to a WHNF using $3n + 4$ beta reductions. 
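
The count in Theorem 1 can be checked on small examples with an environment-based evaluator that follows the (LAMBDA), (VAR) and (BETA) rules and charges 1 for each use of (BETA). This is only a sketch under stated assumptions: the names `Var`, `Lam`, `App`, `bar`, `encode` and `eval_cbn` are hypothetical and not taken from the paper, environments are modelled as Python dictionaries, and the test terms avoid the variable names `a`, `b` and `m`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

I = Lam("x", Var("x"))
SELFINT = Lam("m", App(App(Var("m"), I), I))

def bar(t):
    # Overline translation (assumes t does not use the names a or b).
    if isinstance(t, Var):
        return t
    if isinstance(t, App):
        return App(App(Var("a"), bar(t.fun)), bar(t.arg))
    return App(Var("b"), Lam(t.var, bar(t.body)))

def encode(t):
    return Lam("a", Lam("b", bar(t)))

def eval_cbn(t, env):
    # Returns (closure, number of BETA uses); a closure is a pair (Lam, env).
    if isinstance(t, Lam):                       # (LAMBDA)
        return (t, env), 0
    if isinstance(t, Var):                       # (VAR): evaluate the bound closure
        term, env1 = env[t.name]
        return eval_cbn(term, env1)
    (lam, env1), k1 = eval_cbn(t.fun, env)       # (BETA): argument is left unevaluated
    env2 = {**env1, lam.var: (t.arg, env)}
    w, k2 = eval_cbn(lam.body, env2)
    return w, k1 + k2 + 1

half = Lam("z", App(Var("z"), Var("z")))
M1 = App(I, Lam("y", Var("y")))
M2 = App(App(Lam("x", Lam("y", Var("x"))), I), App(half, half))

for m in (M1, M2):
    _, n = eval_cbn(m, {})
    _, total = eval_cbn(App(SELFINT, encode(m)), {})
    print(n, total, total == 3 * n + 4)
```

The second term has a diverging argument that call-by-name never touches, so both the direct evaluation and the self-interpretation terminate.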



2.1 A More Realistic Cost Measure 

Just counting beta reductions is a fairly crude way of measuring the cost of 
reduction of lambda terms. In this section and the next we will study measures 
that emulate common methods for implementing functional languages. 

The first of these is (simplified) graph rewriting. In graph rewriting, a 
beta-reduction is implemented by making a new copy of the body of the function 
and inserting the argument in place of the variables. This has a cost which is 
proportional to the size of the function that is applied. Hence, we will use a cost 
measure that for each use of the (BETA) rule has a cost equal to the size of 
the function $(\lambda x.M')$ that is applied. The other rules still count 0, as the use of 
environments and closures in the inference rules does not directly correspond to 
actions in graph rewriting. Instead, we will treat each closure $(P, \rho)$ as the term 
obtained by substituting the free variables in $P$ by the values bound to them in 
$\rho$, after the same has been done recursively to these values. More formally, we 
define the function $\mathit{unfold}$ by: 

$$\begin{aligned}
\mathit{unfold}(P, [\,]) &= P \\
\mathit{unfold}(P, \rho[x \mapsto (Q, \rho')]) &= \mathit{unfold}(P, \rho)[x \backslash \mathit{unfold}(Q, \rho')]
\end{aligned}$$

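We need a small lemma 

Lemma 3. $\mathit{unfold}(\overline{P}, \overline{\rho}) = \overline{\mathit{unfold}(P, \rho)}[a \backslash I][b \backslash I]$. 

We prove this by induction over the definition of $\mathit{unfold}$: 

$\mathit{unfold}(P, [\,]) = P$: 

$$\begin{aligned}
& \mathit{unfold}(\overline{P}, \overline{[\,]}) \\
={}& \mathit{unfold}(\overline{P}, \rho_3) \\
={}& \mathit{unfold}(\overline{P}, \rho_2)[b \backslash \mathit{unfold}(I, \rho_1)] \\
={}& \mathit{unfold}(\overline{P}, \rho_2)[b \backslash I] \\
={}& \mathit{unfold}(\overline{P}[a \backslash \mathit{unfold}(I, \rho_1)], [\,])[b \backslash I] \\
={}& \overline{P}[a \backslash I][b \backslash I]
\end{aligned}$$

$\mathit{unfold}(P, \rho[x \mapsto (Q, \rho')]) = \mathit{unfold}(P, \rho)[x \backslash \mathit{unfold}(Q, \rho')]$: 

$$\begin{aligned}
& \mathit{unfold}(\overline{P}, \overline{\rho[x \mapsto (Q, \rho')]}) \\
={}& \mathit{unfold}(\overline{P}, \overline{\rho}[x \mapsto (\overline{Q}, \overline{\rho'})]) && \text{by definition of } \overline{\rho} \\
={}& \mathit{unfold}(\overline{P}, \overline{\rho})[x \backslash \mathit{unfold}(\overline{Q}, \overline{\rho'})] && \text{by definition of } \mathit{unfold} \\
={}& \overline{\mathit{unfold}(P, \rho)}[a \backslash I][b \backslash I][x \backslash \overline{\mathit{unfold}(Q, \rho')}[a \backslash I][b \backslash I]] && \text{by induction} \\
={}& \overline{\mathit{unfold}(P, \rho)}[x \backslash \overline{\mathit{unfold}(Q, \rho')}][a \backslash I][b \backslash I] \\
={}& \overline{\mathit{unfold}(P, \rho)[x \backslash \mathit{unfold}(Q, \rho')]}[a \backslash I][b \backslash I]
\end{aligned}$$

□ 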

We count the size of a term as the number of nodes in the syntax tree, i.e. 
one for each variable occurrence plus one for each application and one for each 
abstraction. It is easy to see that the size of $\overline{P}[a \backslash I][b \backslash I]$ is strictly less than 4 
times the size of $P$. 

We first count the cost of the initial part of the tree to be $|\mathit{selfint}| = 8$ for 
the first beta reduction, $|\lceil M \rceil| \le 3|M|$ for the second and the size of $\lambda b.\overline{M}$ with 
$a$ replaced by $I$ ($< 4|M|$) for the third, for a total cost less than $7|M| + 8$. 
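
The size bounds used above can be checked mechanically on a few terms. The following is a minimal sketch, not part of the original development: the names `size`, `replace`, `bar`, `encode` and `check` are hypothetical, and `replace` is a naive substitution that is only safe here because the replacement term is closed and the names `a` and `b` are assumed never to be rebound.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

I = Lam("x", Var("x"))
SELFINT = Lam("m", App(App(Var("m"), I), I))

def bar(t):
    # Overline translation (assumes t does not use the names a or b).
    if isinstance(t, Var):
        return t
    if isinstance(t, App):
        return App(App(Var("a"), bar(t.fun)), bar(t.arg))
    return App(Var("b"), Lam(t.var, bar(t.body)))

def encode(t):
    return Lam("a", Lam("b", bar(t)))

def size(t):
    # Number of nodes in the syntax tree.
    if isinstance(t, Var):
        return 1
    if isinstance(t, Lam):
        return 1 + size(t.body)
    return 1 + size(t.fun) + size(t.arg)

def replace(t, name, s):
    # Naive substitution; assumes s is closed and name is never rebound in t.
    if isinstance(t, Var):
        return s if t.name == name else t
    if isinstance(t, Lam):
        return Lam(t.var, replace(t.body, name, s))
    return App(replace(t.fun, name, s), replace(t.arg, name, s))

def check(m):
    third = Lam("b", replace(bar(m), "a", I))            # λb.bar(M) with a replaced by I
    both = replace(replace(bar(m), "a", I), "b", I)      # bar(M)[a\I][b\I]
    return (size(encode(m)) <= 3 * size(m)
            and size(third) < 4 * size(m)
            and size(both) < 4 * size(m))

half = Lam("z", App(Var("z"), Var("z")))
TERMS = [I,
         App(I, Lam("y", Var("y"))),
         App(App(Lam("x", Lam("y", Var("x"))), I), App(half, half))]

print(size(SELFINT), [check(m) for m in TERMS])
```

For the identity term the representation has exactly three times the size of the term, so the bound on the representation is not strict in general.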

We now proceed with the lemma 

Lemma 4. If derivation of $\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ has cost $c$, then derivation of 
$\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$ has cost at most $4c + 2$. 

Again we prove this by induction following the structure of the proof for 
lemma 1. 

$N = x$: Neither normal evaluation nor the self-interpretation uses the (BETA) 
rule, so the result follows by induction on the subtrees. 

$N = \lambda y.W$: The normal evaluation tree has cost 0 while self-interpretation 
uses the (BETA) rule once. The applied function is $\lambda z.z$ which has size 2, so we 
have what we need. 

$N = N_1\,N_2$: Assuming the subtrees have costs $k_1$ and $k_2$ respectively, the 
total cost of normal evaluation is $k_1 + k_2 + s$, where $s$ is the size of $\mathit{unfold}(\lambda v.N_3, \rho'')$. 
By induction, the costs for the subtrees for self-interpretation are at most $4k_1 + 2$ 
and $4k_2 + 2$. The tree uses (BETA) twice, once for the function $\lambda z.z$ (size 2) and 
once for $\mathit{unfold}(\lambda v.\overline{N_3}, \overline{\rho''}) = \lambda v.\overline{\mathit{unfold}(N_3, \rho'')}[a \backslash I][b \backslash I]$. 

Since the size of $\overline{\mathit{unfold}(N_3, \rho'')}[a \backslash I][b \backslash I]$ is strictly less than 4 times the size 
of $\mathit{unfold}(N_3, \rho'')$, we have that the size of $\lambda v.\overline{\mathit{unfold}(N_3, \rho'')}[a \backslash I][b \backslash I]$ is at most 
$4|\mathit{unfold}(N_3, \rho'')| - 1 + 1 = 4(s - 1)$. Hence, we have a total cost bounded by 
$4k_1 + 2 + 4k_2 + 2 + 2 + 4(s - 1) \le 4(k_1 + k_2 + s) + 2$, which is what we needed. 

□ 

By combining lemma 4 with the start-up cost of $7|M| + 8$, we get the theorem 

Theorem 2. If a closed term $M$ via the call-by-name semantics evaluates to a 
WHNF in cost $c$, $\mathit{selfint}\;\lceil M \rceil$ evaluates to a WHNF in cost at most $4c + 7|M| + 10$. 

The start-up cost proportional to the size of M is unavoidable, regardless 
of how lambda terms are represented and how the self-interpreter works. We 
required representations to be in normal form, so to perform any evaluation 
that depends on the representation, we will have to apply the representation to 
one or more arguments, which by our measure has a cost proportional to the size 
of the representation, which can not be less than linear in the size of the term. 



2.2 Environment-Based Cost 

Another common method for implementing call-by-name lambda calculus is 
using environments and closures, much as indicated by the inference rules. The 
cost measure used for an environment-based implementation depends on how 
the environments are implemented. Typical data structures for environments 
are linked lists and frames. 

Using a linked list, a new variable is added to the front of the list at unit 
cost, but accessing a variable requires a walk down the linked list and hence has 
a cost that depends on the position of the variable in the environment. With 
the chosen interpreter, we cannot get linear time self-interpretation if linked-list 
environments are used, as looking up the two special variables $a$ and $b$ has a cost 
that depends on the size of the environment, which again depends on the size of 
the program. 

If frames are used, a new extended copy of the environment is built every 
time a new variable is added to it. This has cost proportional to the size of 
the built environment, but accessing a variable in the environment now uses 
a constant offset from the base of the frame, which is unit cost. We shall see 
below that we can get linear time self-interpretation when frames are used to 
represent environments. 

Our cost measure now counts each use of the (VAR) or (LAMBDA) rule as 
1 and each use of the (BETA) rule as the size of the new frame, i.e. $|\rho'| + 1$. 

We first note that the cost of the initial part of the evaluation tree is 8. We 
then state and prove the following lemma: 

Lemma 5. If derivation of $\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ has cost $c$, then derivation of 
$\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$ has cost at most $8c$. 

$N = x$: Both normal evaluation and self-interpretation use the (VAR) rule 
once, so if the cost of evaluating the contents of the variable is $k$, the total 
evaluation cost is $k + 1$. By induction, self-interpretation of the contents costs 
at most $8k$, for a total self-interpretation cost of $8k + 1$, which is less than the 
$8(k + 1)$ limit. 

$N = \lambda y.W$: The normal evaluation tree has cost 1, for a single use of the 
(LAMBDA) rule. Self-interpretation uses (VAR) and (LAMBDA) twice each and the 
(BETA) rule once. The size of the expanded environment is 2, so we have a total 
cost of 6, which is less than 8 times the cost of normal evaluation. 

$N = N_1\,N_2$: Assuming the subtrees have costs $k_1$ and $k_2$ respectively, the 
total cost of normal evaluation is $k_1 + k_2 + |\rho''| + 1$. By induction, the costs 
for the subtrees for self-interpretation are at most $8k_1$ and $8k_2$. The tree uses 
(VAR) twice, (LAMBDA) once and (BETA) twice, once for the function $\lambda z.z$ 
(where the size of the expanded environment is 2) and once for $(\lambda v.\overline{N_3}, \overline{\rho''})$. Since 
$|\overline{\rho''}| = |\rho''| + 2$, the total cost is bounded by $8k_1 + 8k_2 + 2 + 1 + 2 + |\rho''| + 3 = 
8k_1 + 8k_2 + |\rho''| + 8$, which is within the budget of $8(k_1 + k_2 + |\rho''| + 1)$. 

□ 

By adding the start-up cost of 8 to the cost found in lemma 5, we get: 

Theorem 3. If a closed term $M$ evaluates to a WHNF in cost $c$ (using the 
environment-based cost function), $\mathit{selfint}\;\lceil M \rceil$ evaluates to a WHNF in cost at 
most $8c + 8$. 
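
The frame-based measure can be sketched in the same style as the rules: each use of (VAR) or (LAMBDA) costs 1 and each use of (BETA) costs the size of the newly built frame. In the sketch below, which is an illustration under assumptions rather than the author's implementation, environments are modelled as tuples that are copied on extension; the names `eval_cbn`, `lookup`, `bar` and `encode` are hypothetical, and the Python-level lookup is linear even though the cost model charges 1 for it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

I = Lam("x", Var("x"))
SELFINT = Lam("m", App(App(Var("m"), I), I))

def bar(t):
    # Overline translation (assumes t does not use the names a or b).
    if isinstance(t, Var):
        return t
    if isinstance(t, App):
        return App(App(Var("a"), bar(t.fun)), bar(t.arg))
    return App(Var("b"), Lam(t.var, bar(t.body)))

def encode(t):
    return Lam("a", Lam("b", bar(t)))

def lookup(env, name):
    # The most recent binding wins; the cost model charges 1 for this access.
    for n, closure in reversed(env):
        if n == name:
            return closure
    raise KeyError(name)

def eval_cbn(t, env):
    # env is a tuple of (name, closure) pairs; returns (closure, cost).
    if isinstance(t, Lam):                        # (LAMBDA): cost 1
        return (t, env), 1
    if isinstance(t, Var):                        # (VAR): cost 1 plus the bound closure
        term, env1 = lookup(env, t.name)
        w, k = eval_cbn(term, env1)
        return w, k + 1
    (lam, env1), k1 = eval_cbn(t.fun, env)        # (BETA): cost is the new frame size
    env2 = env1 + ((lam.var, (t.arg, env)),)
    w, k2 = eval_cbn(lam.body, env2)
    return w, k1 + k2 + len(env2)

M1 = App(I, Lam("y", Var("y")))
_, c = eval_cbn(M1, ())
_, c_self = eval_cbn(App(SELFINT, encode(M1)), ())
print(c, c_self, c_self <= 8 * c + 8)
```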



3 Linear Time Self-Interpretation Using Call-by-Value Reduction 



We define call-by-value reduction by the inference rules 



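$$\rho \vdash \lambda x.M \Rightarrow (\lambda x.M,\ \rho) \qquad \text{(LAMBDA)}$$

$$\rho \vdash x \Rightarrow \rho(x) \qquad \text{(VARV)}$$

$$\frac{\rho \vdash M \Rightarrow (\lambda x.M',\ \rho') \qquad \rho \vdash N \Rightarrow V \qquad \rho'[x \mapsto V] \vdash M' \Rightarrow W}{\rho \vdash M\,N \Rightarrow W} \qquad \text{(BETAV)}$$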



We again omit the derivation of the initial stages of self-interpretation. We will 
slightly change the definition of $\overline{\rho}$ to reflect that variables are bound to values, i.e., 
(closures of) terms in WHNF: 

$$\begin{aligned}
\overline{[\,]} &= \rho_3 \\
\overline{\rho[x \mapsto (\lambda x.P, \rho')]} &= \overline{\rho}[x \mapsto (\lambda x.\overline{P}, \overline{\rho'})]
\end{aligned}$$

We first define a simulation lemma for call-by-value: 

Lemma 6. If we, using the call-by-value inference rules, can derive $\rho \vdash N \Rightarrow 
(\lambda y.W, \rho')$ then we can also derive $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$, 

which we prove in figure 2. 

Again, we assign different costs to the rules to obtain linear-time 
self-interpretation results. We start by counting beta-reductions. 

The initial part of the tree uses 3 beta reductions. For the remainder we use 
a lemma like the one for call-by-name: 

Lemma 7. If call-by-value derivation of $\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ uses $n$ 
beta-reductions, then call-by-value derivation of $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$ uses at most 
$4n + 1$ beta-reductions. 
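
We prove lemma 6 by induction over the evaluation tree with $N$ at its root: 

$N = x$: 

Normal evaluation: $\rho \vdash x \Rightarrow (\lambda y.W, \rho')$. Self-interpretation: $\overline{\rho} \vdash x \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$. 

$N = \lambda y.W$: $\rho \vdash N \Rightarrow (\lambda y.W, \rho)$ is a leaf tree. $\overline{N} = b\,(\lambda y.\overline{W})$, so we get 

$$\dfrac{\overline{\rho} \vdash b \Rightarrow (\lambda z.z, \rho_1) \qquad \rho_1[z \mapsto (\lambda y.\overline{W}, \overline{\rho})] \vdash z \Rightarrow (\lambda y.\overline{W}, \overline{\rho})}{\overline{\rho} \vdash b\,(\lambda y.\overline{W}) \Rightarrow (\lambda y.\overline{W}, \overline{\rho})}$$

$N = N_1\,N_2$: The normal evaluation tree is 

$$\dfrac{
  \dfrac{\cdots}{\rho \vdash N_1 \Rightarrow (\lambda v.N_3, \rho'')}
  \qquad
  \dfrac{\cdots}{\rho \vdash N_2 \Rightarrow (\lambda w.N_4, \rho''')}
  \qquad
  \dfrac{\cdots}{\rho''[v \mapsto (\lambda w.N_4, \rho''')] \vdash N_3 \Rightarrow (\lambda y.W, \rho')}
}{\rho \vdash N_1\,N_2 \Rightarrow (\lambda y.W, \rho')}$$

We have $\overline{N} = a\,\overline{N_1}\,\overline{N_2}$, so we get (by induction) the following tree for self-interpretation 

$$\dfrac{
  (*)
  \qquad
  \dfrac{\cdots}{\overline{\rho} \vdash \overline{N_2} \Rightarrow (\lambda w.\overline{N_4}, \overline{\rho'''})}
  \qquad
  \dfrac{\cdots}{\overline{\rho''}[v \mapsto (\lambda w.\overline{N_4}, \overline{\rho'''})] \vdash \overline{N_3} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}
}{\overline{\rho} \vdash a\,\overline{N_1}\,\overline{N_2} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}$$

where $(*)$ is the tree 

$$\dfrac{
  \overline{\rho} \vdash a \Rightarrow (\lambda z.z, \rho_1)
  \qquad
  \dfrac{\cdots}{\overline{\rho} \vdash \overline{N_1} \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}
  \qquad
  \rho_1[z \mapsto (\lambda v.\overline{N_3}, \overline{\rho''})] \vdash z \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})
}{\overline{\rho} \vdash a\,\overline{N_1} \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}$$

Fig. 2. Proof of lemma 6 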



We will use the structure of lemma 6 for proving this. 

$N = x$: Neither normal evaluation nor self-interpretation uses beta reductions. 

$N = \lambda y.W$: The normal evaluation tree uses 0 reductions while 
self-interpretation uses the (BETAV) rule once, giving $4 \cdot 0 + 1$, as we needed. 

$N = N_1\,N_2$: Assuming the subtrees use $k_1$, $k_2$ and $k_3$ beta reductions 
respectively, the total number of reductions in normal evaluation is $k_1 + k_2 + k_3 + 1$. 
By induction, the subtrees for self-interpretation use at most $4k_1 + 1$, $4k_2 + 1$ 
and $4k_3 + 1$ reductions. The tree uses (BETAV) twice, so the total reduction count 
is bounded by $4k_1 + 4k_2 + 4k_3 + 5 = 4(k_1 + k_2 + k_3 + 1) + 1$, which is what we 
want. 

□ 

By adding the cost for the initial stages of self-interpretation, we get: 

Theorem 4. If a closed term $M$ evaluates to a WHNF using $n$ call-by-value beta 
reductions, $\mathit{selfint}\;\lceil M \rceil$ evaluates to a WHNF using at most $4n + 4$ call-by-value 
beta reductions. 
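
The call-by-value bound can be checked in the same way as the call-by-name one. The sketch below follows the (LAMBDA), (VARV) and (BETAV) rules and counts uses of (BETAV); the names `eval_cbv`, `bar` and `encode` are hypothetical, environments are Python dictionaries, and the test terms are assumed to avoid the names `a`, `b` and `m`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

I = Lam("x", Var("x"))
SELFINT = Lam("m", App(App(Var("m"), I), I))

def bar(t):
    # Overline translation (assumes t does not use the names a or b).
    if isinstance(t, Var):
        return t
    if isinstance(t, App):
        return App(App(Var("a"), bar(t.fun)), bar(t.arg))
    return App(Var("b"), Lam(t.var, bar(t.body)))

def encode(t):
    return Lam("a", Lam("b", bar(t)))

def eval_cbv(t, env):
    # Returns (value, number of BETAV uses); values are closures (Lam, env).
    if isinstance(t, Lam):                        # (LAMBDA)
        return (t, env), 0
    if isinstance(t, Var):                        # (VARV): the variable holds a value
        return env[t.name], 0
    (lam, env1), k1 = eval_cbv(t.fun, env)        # (BETAV): evaluate the argument first
    v, k2 = eval_cbv(t.arg, env)
    w, k3 = eval_cbv(lam.body, {**env1, lam.var: v})
    return w, k1 + k2 + k3 + 1

M1 = App(I, Lam("y", Var("y")))
M3 = App(App(Lam("x", Lam("y", Var("x"))), I), I)

for m in (M1, M3):
    _, n = eval_cbv(m, {})
    _, total = eval_cbv(App(SELFINT, encode(m)), {})
    print(n, total, total <= 4 * n + 4)
```

Unlike the call-by-name case, the bound is an inequality, because looking up a variable costs nothing in either evaluation.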

3.1 Substitution-Based Cost 

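Again we want to base the cost of a beta reduction on the size of the function, 
and again we consider a value $(\lambda y.P, \rho)$ to represent the term $\mathit{unfold}(\lambda y.P, \rho)$. 

We need a variant of lemma 3, using the new definition of $\overline{\rho}$. We do not get 
equality, as we did in lemma 3, as some terms $T$ may be replaced by $(I\;T)$. We 
define $P \sqsubseteq Q$ to mean that some subterms $T$ in $P$ may be replaced by $(I\;T)$ 
in $Q$ and use this in the definition of the new lemma. Note that the size of $P$ is no 
larger than the size of $Q$. 

Lemma 8. $\mathit{unfold}(\lambda y.\overline{P}, \overline{\rho}) \sqsubseteq \lambda y.\overline{\mathit{unfold}(P, \rho)}[a \backslash I][b \backslash I]$. 

We prove lemma 8 similarly to the way we proved lemma 3: 

$\mathit{unfold}(\lambda y.P, [\,]) = \lambda y.P$: 

$$\begin{aligned}
& \mathit{unfold}(\lambda y.\overline{P}, \overline{[\,]}) \\
={}& \mathit{unfold}(\lambda y.\overline{P}, \rho_3) \\
={}& \mathit{unfold}(\lambda y.\overline{P}, \rho_2)[b \backslash \mathit{unfold}(I, \rho_1)] \\
={}& \mathit{unfold}(\lambda y.\overline{P}, \rho_2)[b \backslash I] \\
={}& \mathit{unfold}(\lambda y.\overline{P}[a \backslash \mathit{unfold}(I, \rho_1)], [\,])[b \backslash I] \\
={}& \lambda y.\overline{P}[a \backslash I][b \backslash I]
\end{aligned}$$

$\mathit{unfold}(\lambda y.P, \rho[x \mapsto (\lambda z.Q, \rho')]) = \mathit{unfold}(\lambda y.P, \rho)[x \backslash \mathit{unfold}(\lambda z.Q, \rho')]$: 

$$\begin{aligned}
& \mathit{unfold}(\lambda y.\overline{P}, \overline{\rho[x \mapsto (\lambda z.Q, \rho')]}) \\
={}& \mathit{unfold}(\lambda y.\overline{P}, \overline{\rho}[x \mapsto (\lambda z.\overline{Q}, \overline{\rho'})]) && \text{by definition of } \overline{\rho} \\
={}& \mathit{unfold}(\lambda y.\overline{P}, \overline{\rho})[x \backslash \mathit{unfold}(\lambda z.\overline{Q}, \overline{\rho'})] && \text{by definition of } \mathit{unfold} \\
\sqsubseteq{}& \lambda y.\overline{\mathit{unfold}(P, \rho)}[a \backslash I][b \backslash I][x \backslash \lambda z.\overline{\mathit{unfold}(Q, \rho')}[a \backslash I][b \backslash I]] && \text{by induction} \\
={}& \lambda y.\overline{\mathit{unfold}(P, \rho)}[x \backslash \lambda z.\overline{\mathit{unfold}(Q, \rho')}][a \backslash I][b \backslash I] \\
\sqsubseteq{}& \lambda y.\overline{\mathit{unfold}(P, \rho)[x \backslash \mathit{unfold}(\lambda z.Q, \rho')]}[a \backslash I][b \backslash I]
\end{aligned}$$

Since the size of $\overline{P}[a \backslash I][b \backslash I]$ is strictly less than 4 times the size of $P$, we see 
that 

$$|\mathit{unfold}(\lambda y.\overline{P}, \overline{\rho})| \le |\lambda y.\overline{\mathit{unfold}(P, \rho)}[a \backslash I][b \backslash I]| < 1 + 4|\mathit{unfold}(P, \rho)|.$$

□ 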

We count the cost of the initial part of the tree to be at most $7|M| + 8$, just 
as for the call-by-name case. For the rest, we use the lemma 

Lemma 9. If call-by-value derivation of $\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ has cost $c$, then 
call-by-value derivation of $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$ has cost at most $5c + 2$. 

Proof: 

$N = x$: Both normal evaluation and self-interpretation have cost 0. 

$N = \lambda y.W$: The normal evaluation tree has cost 0 while self-interpretation 
uses the (BETAV) rule once for the term $\lambda z.z$, which has size 2, giving $5 \cdot 0 + 2$, 
as we needed. 

$N = N_1\,N_2$: Assuming the subtrees have costs $k_1$, $k_2$ and $k_3$ respectively, 
the total cost of normal evaluation is $k_1 + k_2 + k_3 + s$, where $s$ is the size of 
$\mathit{unfold}(\lambda v.N_3, \rho'')$. By induction, the costs for the subtrees for self-interpretation 
are at most $5k_1 + 2$, $5k_2 + 2$ and $5k_3 + 2$. The tree uses (BETAV) twice, once 
for $\lambda z.z$, which has size 2, and once for $\mathit{unfold}(\lambda v.\overline{N_3}, \overline{\rho''})$, which is of size at 
most $4(s - 1)$, so the total cost is bounded by $5k_1 + 5k_2 + 5k_3 + 8 + 4(s - 1) = 
5(k_1 + k_2 + k_3 + s) + 4 - s$. Since the smallest possible value for $s$ is 2, we have 
what we want. 

Combined with the initial cost of $7|M| + 8$, we get 

Theorem 5. If a closed term $M$ evaluates by call-by-value to a WHNF in cost $c$, 
$\mathit{selfint}\;\lceil M \rceil$ evaluates by call-by-value to a WHNF in cost at most $5c + 7|M| + 10$. 



3.2 Environment-Based Cost 

The environment-based cost measure is the same as for call-by-name. The cost 
of the initial section of the tree is 9. For the rest, the lemma 

Lemma 10. If call-by-value derivation of $\rho \vdash N \Rightarrow (\lambda y.W, \rho')$ has cost $c$, then 
call-by-value derivation of $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$ has cost at most $7c$. 

is used. We prove this as before. 

$N = x$: Both normal evaluation and self-interpretation have cost 1. 

$N = \lambda y.W$: The normal evaluation tree has cost 1 while self-interpretation 
uses (VARV) twice and the (BETAV) rule once for the term $\lambda z.z$, where the 
expanded environment is of size 2, giving a total cost of 4. This is well below the 
limit. 

$N = N_1\,N_2$: Assuming the subtrees have costs $k_1$, $k_2$ and $k_3$ respectively, the 
total cost of normal evaluation is $k_1 + k_2 + k_3 + |\rho''| + 1$. By induction, the costs 
for the subtrees for self-interpretation are at most $7k_1$, $7k_2$ and $7k_3$. The tree 
uses (VARV) and (BETAV) twice each, the latter once for $\lambda z.z$ (cost 2) and once 
for $(\lambda v.\overline{N_3}, \overline{\rho''})$, which has cost $|\overline{\rho''}| + 1 = |\rho''| + 3$, so the total cost is bounded 
by $7k_1 + 7k_2 + 7k_3 + |\rho''| + 7 \le 7(k_1 + k_2 + k_3 + |\rho''| + 1)$, which is what we want. 

Combined with the initial cost of 9, we get 

Theorem 6. If a closed term $M$ evaluates by call-by-value to a WHNF in cost 
$c$ (using environment-based cost), $\mathit{selfint}\;\lceil M \rceil$ evaluates by call-by-value to a 
WHNF in cost at most $7c + 9$. 

4 Call-by-Need Reduction 

Describing call-by-need reduction by a set of inference rules is not as easy as for 
call-by-name or call-by-value. Typically, a store is threaded through the evalu- 
ation and used for updating closures. This is, however, rather complex, so we 
use a different approach: We make the semantics nondeterministic by adding an 
alternative application rule to the call-by-value semantics: 

$$\frac{\rho \vdash M \Rightarrow (\lambda x.M',\ \rho') \qquad \rho'[x \mapsto \bullet] \vdash M' \Rightarrow W}{\rho \vdash M\,N \Rightarrow W} \qquad \text{(DUMMY)}$$

The (BETAV) rule from the call-by-value semantics evaluates the argument; the 
(DUMMY) rule doesn’t, but inserts a dummy value • in the environment instead 
of the value of the argument. There is no rule that allows • in computations, so 
choosing the latter application rule will only lead to an answer if the value is 
not needed. 

These rules model call-by-need, call-by-value and everything in between. 
We can define a partial order on inference trees for the same expression by saying 
that a tree $T_1$ is less than a tree $T_2$ if $T_2$ uses the (BETAV) rule whenever $T_1$ 
does. The least tree in this ordering that computes a non-• result corresponds to 
call-by-need reduction to WHNF. Hence, we have moved parts of the operational 
behaviour of the language to the meta-level of the semantic rules, rather than 
in the rules themselves. 

This characterization of call-by-need may not seem very operational. 
However, a process that builds a minimal evaluation tree may mimic traditional 
implementations of call-by-need: When an application is evaluated, the (DUMMY) 
rule is first used. If it later turns out that the argument is in fact needed (when 
a use of a • is attempted), the origin of the • is traced back to the offending 
(DUMMY) rule. This is then forcibly overwritten with a (BETAV) rule and the 
sub-tree for the argument constructed. When this is done, the • is replaced by 
the correct value and computation resumed at the place it was aborted. Hence, 
•’s play the role of suspensions and the replacement of a (DUMMY) rule by a 
(BETAV) rule corresponds to updating the suspension. 
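
The process described above can be approximated by a simple, deliberately inefficient sketch: every application first uses (DUMMY), and when a • is looked up, the application that created it is switched to (BETAV) and evaluation restarts from the beginning instead of resuming in place. All names (`Dummy`, `Needed`, `ev`, `eval_need`) are hypothetical, and identifying an application by its position (path) in the evaluation tree is an assumption made for this illustration, not something taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

class Dummy:
    # The dummy value; remembers which application node created it.
    def __init__(self, site):
        self.site = site

class Needed(Exception):
    # Raised when a dummy value is used in a computation.
    def __init__(self, site):
        super().__init__(site)
        self.site = site

def ev(t, env, path, forced):
    # path identifies the current node of the evaluation tree by premise roles:
    # 0 = function, 1 = argument, 2 = body.
    if isinstance(t, Lam):                         # (LAMBDA)
        return (t, env)
    if isinstance(t, Var):                         # (VARV), but no rule accepts a dummy
        v = env[t.name]
        if isinstance(v, Dummy):
            raise Needed(v.site)
        return v
    lam, env1 = ev(t.fun, env, path + (0,), forced)
    if path in forced:                             # (BETAV): evaluate the argument
        v = ev(t.arg, env, path + (1,), forced)
    else:                                          # (DUMMY): insert a dummy instead
        v = Dummy(path)
    return ev(lam.body, {**env1, lam.var: v}, path + (2,), forced)

def eval_need(t):
    forced = set()
    while True:
        try:
            return ev(t, {}, (), forced), forced
        except Needed as e:
            forced.add(e.site)                     # overwrite (DUMMY) by (BETAV), restart

I = Lam("x", Var("x"))
half = Lam("z", App(Var("z"), Var("z")))
term = App(App(Lam("x", Lam("y", Var("x"))), I), App(half, half))
result, forced = eval_need(term)
print(result[0], forced)
```

In this example only the inner application needs its argument, so the diverging outer argument is never evaluated.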

The initial part of self-interpretation for call-by-need is the same as for the 
call-by-value case, except that for simple terms, the variables $a$ or $b$ may not 
be needed and can hence be bound to • and the corresponding evaluations of 
their closures not occur. However, the cost of the initial portion will (by any 
reasonable cost measure) be no more than the cost of the call-by-value tree. We 
will use the same initial environments as for the call-by-value case, but extend 
the definition of $\overline{\rho}$ to handle variables that are bound to •. 

$$\begin{aligned}
\overline{[\,]} &= \rho_3 \\
\overline{\rho[x \mapsto (\lambda x.P, \rho')]} &= \overline{\rho}[x \mapsto (\lambda x.\overline{P}, \overline{\rho'})] \\
\overline{\rho[x \mapsto \bullet]} &= \overline{\rho}[x \mapsto \bullet]
\end{aligned}$$

As in the previous cases, we define a call-by-need simulation lemma: 

Lemma 11. If we, using the call-by-need inference rules, can derive $\rho \vdash N \Rightarrow 
(\lambda y.W, \rho')$ then we can also derive $\overline{\rho} \vdash \overline{N} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})$, 

which we prove in figure 3. 

Since lemma 11 includes the cases where variables in the environment are 
bound to •, we conclude that, if normal evaluation does not need the value of a 
variable, then neither does the self-interpreter. 
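
We prove lemma 11 by induction over the evaluation tree with $N$ at its root. Only the 
case for the (DUMMY) rule differs from the proof of lemma 6, so we omit the rest. 

$N = N_1\,N_2$: Using the (DUMMY) rule, the normal evaluation tree is 

$$\dfrac{
  \dfrac{\cdots}{\rho \vdash N_1 \Rightarrow (\lambda v.N_3, \rho'')}
  \qquad
  \dfrac{\cdots}{\rho''[v \mapsto \bullet] \vdash N_3 \Rightarrow (\lambda y.W, \rho')}
}{\rho \vdash N_1\,N_2 \Rightarrow (\lambda y.W, \rho')}$$

which (by induction) leads us to the following self-interpretation tree 

$$\dfrac{
  (*)
  \qquad
  \dfrac{\cdots}{\overline{\rho''}[v \mapsto \bullet] \vdash \overline{N_3} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}
}{\overline{\rho} \vdash a\,\overline{N_1}\,\overline{N_2} \Rightarrow (\lambda y.\overline{W}, \overline{\rho'})}$$

where $(*)$ is the tree 

$$\dfrac{
  \overline{\rho} \vdash a \Rightarrow (\lambda z.z, \rho_1)
  \qquad
  \dfrac{\cdots}{\overline{\rho} \vdash \overline{N_1} \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}
  \qquad
  \rho_1[z \mapsto (\lambda v.\overline{N_3}, \overline{\rho''})] \vdash z \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})
}{\overline{\rho} \vdash a\,\overline{N_1} \Rightarrow (\lambda v.\overline{N_3}, \overline{\rho''})}$$

Fig. 3. Proof of lemma 11 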



□ 



In the proofs of linear-time self-interpretation we will also refer to the proofs 
for the call-by-value case except for the (DUMMY) case, as we use the same cost 
measures and the same constant factors. 

We start by counting beta reductions. Our theorem is 

Theorem 7. If a closed term $M$ via the call-by-need semantics evaluates to a 
WHNF using $n$ call-by-need beta reductions, $\mathit{selfint}\;\lceil M \rceil$ evaluates to a WHNF 
using at most $4n + 4$ call-by-need beta reductions. 

The corresponding lemma proves simulation using $4n + 1$ steps, after the 
initial portion. We use the proof for lemma 7 with the addition of a case for the 
(DUMMY) rule: Normal evaluation uses $k_1 + k_3 + 1$ beta reductions, where $k_1$ 
and $k_3$ are the numbers of beta reductions required for $N_1$ and $N_3$. By induction, 
interpreting $\overline{N_1}$ and $\overline{N_3}$ costs at most $4k_1 + 1$ and $4k_3 + 1$. Additionally, 2 beta 
reductions are used, so the total cost is bounded by $4(k_1 + k_3 + 1)$, which is one 
less than our limit. 

We can now go on to substitution-based cost. We assign the same cost to the 
(DUMMY) rule as to the (BETAV) rule: the size of the function that is applied. 
We extend the definition of $\mathit{unfold}$ to handle •: 

$$\mathit{unfold}(P, \rho[x \mapsto \bullet]) = \mathit{unfold}(P, \rho)[x \backslash d]$$

where $d$ is a free variable that does not occur anywhere else. It is easy to see that 
the same size limit as before applies: $|\mathit{unfold}(\lambda y.\overline{P}, \overline{\rho})| < 4|\mathit{unfold}(P, \rho)|$. Hence, 
we shall go directly to the theorem 

Theorem 8. If a closed term $M$ evaluates by call-by-need to a WHNF in cost $c$, 
$\mathit{selfint}\;\lceil M \rceil$ evaluates by call-by-need to a WHNF in cost at most $5c + 7|M| + 10$. 

Again, we only state the case for the (DUMMY) rule and refer to lemma 9 
for the rest: If normal evaluation has cost $k_1$ and $k_3$ for evaluation of $N_1$ and 
$N_3$, the total cost is $k_1 + k_3 + s$, where $s$ is the size of $\mathit{unfold}(\lambda v.N_3, \rho'')$. For 
self-interpretation, interpretation of $\overline{N_1}$ and $\overline{N_3}$ have by induction costs bounded by 
$5k_1 + 2$ and $5k_3 + 2$. Additionally, we use (BETAV) once at cost 2 and (DUMMY) 
once at cost $|\mathit{unfold}(\lambda v.\overline{N_3}, \overline{\rho''})| < 4|\mathit{unfold}(N_3, \rho'')| = 4(s - 1)$. This gives a total 
cost bounded by $5(k_1 + k_3 + s) - s + 2$, which is well within our limit. 

Environment-based cost is no bigger problem: 

Theorem 9. If a closed term $M$ evaluates by call-by-need to a WHNF in cost $c$ 
(using environment-based cost), $\mathit{selfint}\;\lceil M \rceil$ evaluates by call-by-need to a WHNF 
in cost at most $7c + 9$. 

Again, we refer to the proof for the call-by-value case except for an additional 
case for the proof of lemma 10 to handle the (DUMMY) rule: 

Normal evaluation uses the (DUMMY) rule at cost $|\rho''| + 1$ plus the costs of 
evaluating $N_1$ and $N_3$, which we set at $k_1$ and $k_3$. Self-interpretation uses at most 
$7k_1$ and $7k_3$ to interpret $\overline{N_1}$ and $\overline{N_3}$. To this we add two uses of (VARV), one 
use of (BETAV) at cost 2 and the use of (DUMMY) at cost $|\overline{\rho''}| + 1 = |\rho''| + 3$. 
This adds up to $7(k_1 + k_3 + |\rho''| + 1) - 6|\rho''|$, which is within our budget. 

5 Conclusion and Future Work 

We have proven that a simple self-interpreter for the pure lambda calculus can 
do self-interpretation in linear time, i.e. constant overhead. We proved this for 
reduction to weak head normal form using call-by-name, call-by-value and call- 
by-need using three different cost measures. 

It would be interesting to extend the present work to include studies of self- 
interpretation cost for reduction to head normal form and full normal form. 
The author expects these to have linear-time self-interpretation too, but is not 
currently working on proving this. 

Apart from being interesting in its own right, the result is a step towards 
proving the existence of a linear-time complexity hierarchy for the pure lambda 
calculus, along the lines of Jones’ result for first-order functional and imperative 
languages [2]. The proof involves a self-interpreter that not only has constant 
overhead but also counts the amount of time (by some cost measure) it uses. If 
it cannot finish within a set budget of time, the self-interpreter stops with a 
special error-value. This self-interpreter is then used in a diagonalization proof 
reminiscent of the classical halting-problem proof to show that a certain problem 
can be solved in time $o(kn)$ but not in time $o(n)$, where $k$ is the interpretation 
overhead. 

We are currently working on this and have sketched a proof for call-by-name 
reduction to WHNF. However, due to the resource counting the proof is about 
an order of magnitude harder than the proofs shown in this paper, so we are 
investigating ways to simplify the proofs. 

This study has some relation to the work by Rose [7] on showing that there 
exists a linear-time hierarchy for CAM, an abstract machine used for implementing 
higher-order functional languages. This was proven by showing linear-time 
interpretations between CAM and the language used in Jones’ paper. This method 
does not carry over to the lambda calculus, as such interpretations are not likely 
to exist, at least not for natural complexity measures for reduction in the lambda 
calculus. 

Rose [6] goes on to attempt to characterize necessary conditions for the 
existence of a linear-time hierarchy. It is stated that for a language to 
support a linear-time hierarchy, it may not allow constant-time access to a 
non-constant number of variables, locations, symbols or functions, where the 
constant is uniform over all programs. This would indicate that any cost measure 
for the lambda calculus that allows constant-time access to variables (e.g. 
counting beta-reductions) contradicts the existence of a linear-time hierarchy. However, 
the proof sketch mentioned above indicates that one such actually does exist. 
We will look further into this apparent contradiction in future work. 

In [4], a different representation of lambda terms was used. It was based 
on higher-order abstract syntax, but used a standard-style representation where 
recursion over the syntax is not encoded in the term itself. Hence, the self- 
interpreter needed to use an explicitly coded fixed-point combinator, making it 
somewhat more complex than the one used in this paper. Redoing the proofs in 
this paper for that self-interpreter will be much more work due to the larger size, 
but the same principles should apply and we expect a (much larger) constant 
overhead for this case as well. 

The use of a nondeterministic operational semantics to encode call-by-need 
reduction made the proofs for this very simple. To our knowledge, this technique 
hasn’t been used earlier, though a similar notion (replacing a term by •) has 
been used to define neededness [1]. We expect it to be useful for proving other 
properties about call-by-need reduction. 

Our discussion of different cost measures may seem similar to the discussions 
by e.g. Lawall and Mairson [3] on cost models for the lambda calculus. How- 
ever, these models are meant to be independent of any particular implementa- 
tion whereas the measures presented here try to mimic specific implementation 
methods. 
1 Regular Schemes 

In the following, we assume that sets of variables, basic statements, and selectors 
are given. For each basic statement $s$, let us choose subsets of arguments $A(s)$, 
results $R(s)$, and obligatory results $R'(s) \subseteq R(s)$ in the set of variables. 
For each selector $c$, we choose the set of its arguments $A(c)$. The sets of results 
and obligatory results for selectors are considered to be empty. In addition, an arity 
$ar(c) \in \mathbb{N}$ is assigned to each selector $c$. 

Regular schemes (hereafter, schemes) are directed ordered labeled graphs of 
a special kind. The set of schemes can be inductively described as follows. 

1. The graph with the empty set of nodes and arcs is a scheme: this is an 
empty scheme. For any nonempty scheme S, we will indicate two distinguished 
nodes — the input and the output. 

2. A graph without arcs with the single node v labeled by a basic statement 
s is a scheme: this scheme corresponds to the basic statement s and has node v 
as its input and output. 

3. Let $S_1$ and $S_2$ be nonempty schemes. Connect the output of $S_1$ with the 
input of $S_2$ by a new arc. Let the input of $S_1$ be the input of the graph $S$ 
constructed in this way, and the output of $S_2$ be its output. Extend the order of 
$S$ by the relation of the new arc with itself. Then, the graph $S$ is a scheme: we 
will say that $S$ is obtained from $S_1$ and $S_2$ by the series union and write it 
as $S = S_1 \circ S_2$. 

4. Let $B$ be a scheme, $c$ be a selector, and $ar(c) = 2$. Consider two new nodes 
$v$ and $w$, the input and the output of a new scheme $S$, respectively. Let us 
label $w$ by the selector $c$ and connect $w$ and $v$ by a new arc. Then, we act as 
follows. If $B$ is nonempty, we connect $v$ with the input of $B$ by a new arc, and 
the output of $B$ with $w$, by another new arc. If $B$ is empty, we connect $v$ and 
$w$ by a new arc. For each new arc, we extend the order of $S$ by the relation of 
this arc with itself. Graph $S$, constructed as described above, is a scheme: we 
will say that $S$ is the loop with the body $B$ and the condition $c$. 

5. Let $B_1, \ldots, B_n$ be regular schemes, $c$ be a selector, and $ar(c) = n$. 
Consider two new nodes $v$ and $w$, the input and the output of a new scheme $S$, 
respectively. Let us label $v$ by the selector $c$. Then, we act as follows. For each 
nonempty scheme $B_i$, we connect $v$ with the input of $B_i$ by a new arc, and the 



D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI’99, LNCS 1755, pp. 143-148, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 







Denis L. Uvarov 



output of $B_i$ with $w$, by another new arc. For each empty scheme $B_i$, we connect 
$v$ and $w$ by a new arc. For each pair of new arcs with the common beginning 
and belonging to schemes $B_i$ and $B_j$, respectively, extend the order of $S$ by the 
relation between these arcs if $i < j$. Graph $S$ constructed as described above is a 
scheme: we will say that $S$ is the hammock with the selector $c$ and the branches 
$B_1, \ldots, B_n$. 

A scheme is called a component if it is nonempty and cannot be represented 
in a form of the series union of two nonempty schemes. 

We now define the sets of arguments $A(S)$, results $R(S)$, and obligatory 
results $R'(S)$ for scheme $S$. A path in a regular scheme is a sequence 
$v_1 e_1 \ldots v_{n-1} e_{n-1} v_n$, consisting of nodes $v_1, \ldots, v_n$ and arcs 
$e_1, \ldots, e_{n-1}$, in which the arc $e_i$ leads from $v_i$ to $v_{i+1}$. An 
execution chain in a regular scheme is a sequence of labels written down when going 
along a path from the input to the output. For an execution chain $\alpha$, let us set 
$A(\alpha) = R(\alpha) = R'(\alpha) = \emptyset$ if $\alpha$ is empty, and 
$A(\alpha) = A(\alpha_1) \cup (A(\alpha_2) \setminus R'(\alpha_1))$, 
$R(\alpha) = R(\alpha_1) \cup R(\alpha_2)$, 
$R'(\alpha) = R'(\alpha_1) \cup R'(\alpha_2)$, 
if $\alpha$ can be represented as a concatenation of subchains $\alpha_1$ and 
$\alpha_2$ at least one of which is nonempty. 

If $S$ is empty, we set $A(S) = R(S) = R'(S) = \emptyset$. Now assume that $S$ is 
nonempty. Let $EC(S)$ be the set of all execution chains of $S$. Then, 
$A(S) = \bigcup_{\alpha \in EC(S)} A(\alpha)$, $R(S) = \bigcup_{\alpha \in EC(S)} R(\alpha)$, 
and $R'(S) = \bigcap_{\alpha \in EC(S)} R'(\alpha)$. 
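
The definitions of the argument, result, and obligatory-result sets can be illustrated with a minimal Python sketch. It assumes that every label (basic statement or selector) is modelled as a triple of frozensets (A, R, R'), and that a scheme is given by a finite list of its execution chains; for schemes with loops the set of chains is infinite, so this is only an illustration. The names `chain_sets` and `scheme_sets` are hypothetical and do not come from the paper.

```python
from functools import reduce

def chain_sets(chain):
    """Fold (A, R, R') over an execution chain, label by label."""
    args, res, oblig = frozenset(), frozenset(), frozenset()
    for a, r, ro in chain:
        # A(a1 a2) = A(a1) | (A(a2) \ R'(a1)): a variable read after an
        # obligatory write inside the chain is not an argument of the chain.
        args = args | (a - oblig)
        res = res | r
        oblig = oblig | ro
    return args, res, oblig

def scheme_sets(chains):
    """A(S) and R(S) are unions over the chains, R'(S) is the intersection."""
    triples = [chain_sets(c) for c in chains]
    if not triples:  # empty scheme
        return frozenset(), frozenset(), frozenset()
    a = frozenset().union(*(t[0] for t in triples))
    r = frozenset().union(*(t[1] for t in triples))
    ro = reduce(lambda x, y: x & y, (t[2] for t in triples))
    return a, r, ro

# Two hypothetical statements: "x := 1" and "y := x + 1".
s1 = (frozenset(), frozenset({"x"}), frozenset({"x"}))
s2 = (frozenset({"x"}), frozenset({"y"}), frozenset({"y"}))
```

Here `chain_sets([s1, s2])` has no arguments because x is obligatorily written before it is read, while a scheme that may also skip `s1` needs x as an argument and guarantees only y as an obligatory result.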

Let $T$ be a nonempty subscheme of $S$. Let us denote by $S[T/T']$ the scheme that 
is obtained from $S$ by replacing $T$ with $T'$. If $T'$ is the empty scheme, 
we will say that $S_{(T)} = S[T/T']$ is obtained from $S$ by deleting subscheme $T$. 

A memory state is either determined by the set of values of all variables 
or is an invalid state. An interpretation assigns a function of transformation 
of memory states to each basic statement; interpretation also assigns to each 
selector a function that generates a number of the branch to be chosen, depending 
on the memory state, or generates an error message. Once an interpretation is 
specified, it can be extended on the set of all regular schemes in a natural way. 
Two schemes are called equivalent if any interpretation assigns them identical 
functions. 

A regular scheme $S$ is called a pseudoscheme if the parent scheme $par(S)$ and 
the sets of candidates for deletion after the removal up or down, u-dels(S) and 
d-dels(S), are specified. 

An execution chain $\alpha$ is called irredundant if it cannot be represented in the 
form of the concatenation of three subchains $\alpha = \alpha_1 \alpha_2 \alpha_3$ in 
such a way that $t \in R(\alpha_2)$, $t \in R'(\alpha_3)$, $t \notin A(\alpha_2)$, and 
$t \notin A(\alpha_3)$ for some variable $t$. A scheme $S$ is irredundant if, for each 
of its arcs, there exists an irredundant execution chain obtained from a path 
involving this arc. If a scheme is irredundant, each of its subschemes is also 
irredundant. 
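
As an illustration, irredundancy of a single execution chain can be checked by brute force over all splits into three subchains. The sketch below assumes the reading of the definition in which a chain is redundant when some variable t satisfies t ∈ R(α2), t ∈ R'(α3), t ∉ A(α2), and t ∉ A(α3), and it models each label as a triple of frozensets (A, R, R'). The function names are hypothetical.

```python
from itertools import combinations_with_replacement

def chain_sets(chain):
    """Return (A, R, R') of an execution chain given as (A, R, R') triples."""
    args, res, oblig = frozenset(), frozenset(), frozenset()
    for a, r, ro in chain:
        args = args | (a - oblig)
        res = res | r
        oblig = oblig | ro
    return args, res, oblig

def is_irredundant(chain):
    """True if no split chain = a1 a2 a3 exposes a variable that is written
    in a2, obligatorily overwritten in a3, and an argument of neither."""
    n = len(chain)
    # (i, j) with i <= j: a1 = chain[:i], a2 = chain[i:j], a3 = chain[j:]
    for i, j in combinations_with_replacement(range(n + 1), 2):
        a2, r2, _ = chain_sets(chain[i:j])
        a3, _, ro3 = chain_sets(chain[j:])
        if (r2 & ro3) - a2 - a3:
            return False
    return True

# Hypothetical statements: "x := 1", "x := 2", and "y := x".
set_x1 = (frozenset(), frozenset({"x"}), frozenset({"x"}))
set_x2 = (frozenset(), frozenset({"x"}), frozenset({"x"}))
copy_x = (frozenset({"x"}), frozenset({"y"}), frozenset({"y"}))
```

With these statements, the chain `[set_x1, set_x2]` is redundant because the first assignment to x is overwritten without being read, whereas `[set_x1, copy_x]` is irredundant.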

2 Removal from Loops and Hammocks 

Let L be a subloop of a scheme S with a body B = X^o X o X and a condition 
c, where X_, X, X be regular schemes, and X be a component. If the conditions 
of removal up from L _A{X) n R{X) = R{X)nA{X) ^R{X) n R{X) n {A{X) U 
(A(c)_\i?'(X))) = R{X)riR{X)\R'{X)_= A{X)r\R{XoXoX) = 0, md either 
(A(X o X)_U U(c) \ R'^ oX)))n R{X o X o X) = 0 or R{X) n R”{X o X) = 
i?(X)ni?'(XoX)nU(^o^)U(4l(c)\i?'(XoX))) = 0, hold for X, the removal 
up of X from L consists in the substitution of scheme X o L(x) for subloop L. If 
the conditions of removal down from L A{X) n i?(X) = R{X) n {A{2Q U (A{c) \ 
R'iK))) = i?(X)_n R{X) = R{X) n (^X o X) U (A(c) \_R'(X O X))) = 0, and 
either A(X)ni?(XoXoX) = 0 or (A(XoX)U(A(c)\i?'(XoX)))ni?(XoX) = 
A{X) n R{X) \ R'{X o X) = 0, hold for X,. the removal down of X from L 
consists in the substitution of scheme ° ^ for subloop L. 

Let $X$ and $Y$ be linear subcomponents of $B$. We say that the removal up 
of $Y$ depends on the removal up of $X$ and write $X \to_u Y$, if $X$ is arranged 
before $Y$ in $B$, $Y$ is not a pseudoscheme with $par(Y) = X$, and one of the 
intersections $A(X) \cap R(Y)$, $R(X) \cap A(Y)$, or $R(X) \cap R(Y)$ is not empty, or 
$X = Y$ and $A(Y) \cap R(Y) \neq \emptyset$, or $X$ is arranged after $Y$ in $B$ and 
one of the intersections $A(Y) \cap R(X)$ or $R(Y) \cap R(X)$ is not empty. We say 
that the removal down of $Y$ depends on the removal down of $X$ and write 
$X \to_d Y$, if one of the following conditions holds: $X$ is arranged after $Y$ in 
$B$, $Y$ is not a pseudoscheme with $par(Y) = X$, and one of the intersections 
$A(Y) \cap R(X)$, $R(Y) \cap A(X)$, or $R(Y) \cap R(X)$ is not empty, or $X = Y$ and 
$A(Y) \cap R(Y) \neq \emptyset$, or $X$ is arranged before $Y$ in $B$ and one of the 
intersections $R(X) \cap A(Y)$, $A(X) \cap R(Y)$, or $R'(X) \cap R(Y)$ is not empty. 
We say that the removal down of $Y$ depends on the selector $c$ and write 
$c \to_d Y$, if $A(c) \cap R(Y) \neq \emptyset$. 

We define a removal dependency graph $A(L)$ as follows. The set of nodes 
consists of the selector $c$ and all the linear subcomponents of $B$. The set of arcs 
is divided into two nonoverlapping sets of u-arcs and d-arcs, in such a way that 
u-arc $e$ connects nodes $v$ and $w$ if and only if $v \to_u w$, and d-arc $e$ connects 
nodes $v$ and $w$ if and only if $v \to_d w$. Let $deg_u^+(v)$ be the number of u-arcs 
with the end node $v$, and $deg_d^+(v)$ be the number of d-arcs with the end node $v$. 

Let $H$ be a hammock with a selector $c$ and branches $B_1, \ldots, B_n$. A 
candidate chain for removal up from $H$ is a sequence $X = X_1, \ldots, X_n$ of 
mutually isomorphic schemes, such that $X_i$ is a linear subcomponent of $B_i$ and 
the number of linear subcomponents of $B_i$ that are isomorphic to $X_i$ and are 
arranged in $B_i$ before $X_i$ is the same for all $i$. Similarly, a candidate chain 
for removal down from $H$ is a sequence $X = X_1, \ldots, X_n$ of mutually isomorphic 
schemes, such that $X_i$ is a linear subcomponent of $B_i$ and the number of linear 
subcomponents of $B_i$ that are isomorphic to $X_i$ and are arranged in $B_i$ 
after $X_i$ is the same for all $i$. Let us designate any of the schemes 
$X_1, \ldots, X_n$ as $Comm(X)$ and the hammock obtained from $H$ by deleting all 
schemes $X_1, \ldots, X_n$ as $H_{(X)}$. Let $X$ be a candidate chain for removal (up 
or down), and $\underline{X}_i$, $\overline{X}_i$, $i = 1 \ldots n$, be the subschemes 
of $H$ such that $B_i = \underline{X}_i \circ X_i \circ \overline{X}_i$ 
for all $i$. If $X$ is a candidate for removal up and the conditions of removal up 
from id (A(c) u A(w)) n R{Xi) = R(Ti) n A{Xi) = i?(W) n R{Xi) n A(W) = 
R{Xi) n R{Xi) \ R'jXj) = 0 for all i, hold for X, the removal up of chain X 
from id consists in the substitution of scheme Comm{X) o id(x) for hammock 
H. Similarly, if $X$ is a candidate for removal down and the conditions of removal 
down from $H$, $A(X_i) \cap R(\overline{X}_i) = R(X_i) \cap A(\overline{X}_i) = R(X_i) \cap R(\overline{X}_i) = \emptyset$ 
for all $i$, hold for $X$, the removal down of chain $X$ from $H$ consists in the 
substitution of scheme $H_{(X)} \circ Comm(X)$ for hammock $H$. 

Let $X_i$ be a linear subcomponent of $B_i$, and $Y$ be a candidate for removal up 
from $H$. We say that the removal up of $Y$ depends on $X_i$ and write $X_i \to_u Y$, 
if $X_i$ is arranged before $Y_i$ in $B_i$, $Y_i$ is not a pseudoscheme with 
$par(Y_i) = X_i$, and one of the intersections $A(Y_i) \cap R(X_i)$, 
$R(Y_i) \cap A(X_i)$, or $R(Y_i) \cap R(X_i)$ is not empty. Let $X$ be a candidate for 
removal up from $H$. We say that the removal up of $Y$ depends on the removal up of 
$X$ and write $X \to_u Y$, if $X_i \to_u Y$ for all $i$. We say that the removal up of 
$Y$ depends on the selector $c$ and write $c \to_u Y$, if 
$A(c) \cap R(Comm(Y)) \neq \emptyset$. Similarly, let $X_i$ be a linear subcomponent 
of $B_i$, and $Y$ be a candidate for removal down from $H$. We say that the removal 
down of $Y$ depends on the removal down of $X_i$ and write $X_i \to_d Y$, if $X_i$ is 
arranged after $Y_i$ in $B_i$, $Y_i$ is not a pseudoscheme with $par(Y_i) = X_i$, and 
one of the intersections $A(Y_i) \cap R(X_i)$, $R(Y_i) \cap A(X_i)$, or 
$R(Y_i) \cap R(X_i)$ is not empty. Let $X$ be a candidate for removal down from $H$. 
We say that the removal down of $Y$ depends on the removal down of $X$ and write 
$X \to_d Y$, if $X_i \to_d Y$ for all $i$. 

We define a removal dependency graph $A(H)$ as follows. The set of nodes 
consists of the selector $c$, candidate chains for removal up or down from $H$, 
and all the linear subcomponents of $B_1, \ldots, B_n$ that do not belong to any of 
the candidate chains. The set of arcs is divided into two nonoverlapping sets of 
u-arcs and d-arcs in such a way that u-arc $e$ connects nodes $v$ and $w$ if and only 
if $v \to_u w$, and d-arc $e$ connects nodes $v$ and $w$ if and only if $v \to_d w$. 
Let $deg_u^+(v)$ be the number of u-arcs with the end node $v$, and $deg_d^+(v)$ be 
the number of d-arcs with the end node $v$. 

If $X = X_1, \ldots, X_n$ is a candidate chain for removal up or down, then a 
candidate chain $\tilde{X}$ for removal in the opposite direction is called dual to 
$X$ if $\tilde{X}_i = X_i$ for some $i$. 



3 The Purging Algorithm 

Algorithm. 

Input: a regular scheme S. 

Output: a regular scheme S'. 

First, procedure 1 is applied to the input scheme $S$. It constructs a scheme 
$S'$ and the set $\Pi(S')$. Then, each pseudoscheme $T \in \Pi(S')$ is transformed to 
an ordinary scheme (additional information is deleted) and all subschemes belonging 
to the set u-dels(T) are deleted from $S'$ in the process. 

Procedure 1. 

Input: a regular scheme $S$. 

Outputs: a scheme $S'$ and the set $\Pi(S')$. 

1. If $S$ is an empty scheme, or $S$ corresponds to a basic statement, then return 
$S$ and the set $\Pi(S) = \emptyset$. Otherwise, go to step 2. 
2. If $S$ is a loop, then go to step 3. If $S$ is a hammock, then go to step 4. 
Otherwise, let $S = S_1 \circ \ldots \circ S_n$, $n \geq 2$. The procedure 1 is then 
applied to schemes $S_1, \ldots, S_n$. Let $S'_1$, $\Pi(S'_1)$, $\ldots$, $S'_n$, 
$\Pi(S'_n)$, respectively, be the outputs obtained. Then, return the scheme 
$S' = S'_1 \circ \ldots \circ S'_n$ and the set $\Pi(S') = \bigcup_{i \in [1:n]} \Pi(S'_i)$. 

3. Let $S$ be a loop with a body $B$ and a condition $c$. The procedure 1 is then 
applied to the scheme $B$. Let $B'$ and $\Pi(B')$ be the outputs obtained, and $S'$ be 
the loop with the body $B'$ and the condition $c$. The procedure 2 is then applied 
to the loop $S'$ and to the set $\Pi(S') = \Pi(B')$. Let $S''$ be the output obtained. 
Then, return $S''$ and the set $\Pi(S'') = \emptyset$. 

4. Let $S$ be a hammock with a selector $c$ and branches $B_1, \ldots, B_n$. The 
procedure 1 is then applied to schemes $B_1, \ldots, B_n$. Let $B'_1$, $\Pi(B'_1)$, 
$\ldots$, $B'_n$, $\Pi(B'_n)$, respectively, be the outputs obtained, and $S'$ be the 
hammock with the selector $c$ and branches $B'_1, \ldots, B'_n$. The procedure 3 is 
then applied to the hammock $S'$ and to the set $\Pi(S') = \bigcup_{i \in [1:n]} \Pi(B'_i)$. 
Let $S''$ and $\Pi(S'')$ be the outputs obtained. Then, return $S''$ and the set 
$\Pi(S'')$. 

Procedure 2. 

Inputs: a loop $L$ and the set $\Pi(L)$. 

Output: a regular scheme $L'$. 

1. Construct the graph $A(L)$. Then, transform each pseudoscheme $T \in \Pi(L)$ 
to an ordinary scheme and delete from $S'$ all subschemes that belong to the set 
d-dels(T). Set $M = N = \varepsilon$. Then, go to step 2. 

2. If there exists at least one scheme $X$ such that $deg_d^+(X) = 0$, select this 
scheme and go to step 3. Otherwise, go to step 4. 

3. Delete the node $X$ and all arcs that begin in $X$ from $A(L)$. Then, delete 
$X$ from $L$, set $N = X \circ N$, and go to step 2. 

4. If there exists at least one scheme $X$ such that $deg_u^+(X) = 0$, select this 
scheme and go to step 5. Otherwise, go to step 6. 

5. Delete from $A(L)$ the node $X$ and all arcs that begin in $X$. Then, delete 
$X$ from $L$, set $M = M \circ X$, and go to step 2. 

6. Recalculate A{L), R{L), and R'{L). Then, return the scheme M o L o N. 
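
The control flow of steps 2 to 6 of Procedure 2 can be sketched in Python as a worklist over the dependency graph. This is only an illustration under stated assumptions: the graph is given explicitly as two sets of arcs `(v, w)`, read as "the removal of w depends on v"; step 2 is taken to test the number of incoming d-arcs and step 4 the number of incoming u-arcs; the selector is a node that is never selected for removal; and the recalculation of the argument and result sets in step 6 is omitted. The name `purge_loop` and its parameters are hypothetical.

```python
def purge_loop(body, u_arcs, d_arcs):
    """Return (M, remaining_body, N) for a loop whose body is a list of
    subcomponent names; the result corresponds to M o L o N."""
    remaining = list(body)
    u, d = set(u_arcs), set(d_arcs)
    moved_up, moved_down = [], []          # M and N

    def in_degree(arcs, x):
        return sum(1 for (_, w) in arcs if w == x)

    def drop_node(x):
        # Deleting a node removes every arc that begins or ends in it.
        nonlocal u, d
        u = {(v, w) for (v, w) in u if v != x and w != x}
        d = {(v, w) for (v, w) in d if v != x and w != x}

    while True:
        # Steps 2-3: removal down, N = X o N.
        x = next((s for s in remaining if in_degree(d, s) == 0), None)
        if x is not None:
            drop_node(x)
            remaining.remove(x)
            moved_down.insert(0, x)
            continue
        # Steps 4-5: removal up, M = M o X.
        x = next((s for s in remaining if in_degree(u, s) == 0), None)
        if x is not None:
            drop_node(x)
            remaining.remove(x)
            moved_up.append(x)
            continue
        # Step 6: nothing more can be removed.
        return moved_up, remaining, moved_down

# Hypothetical loop body X o Y o Z with selector c.
example = purge_loop(
    ["X", "Y", "Z"],
    u_arcs={("X", "Y"), ("Z", "Z")},
    d_arcs={("c", "X"), ("c", "Y"), ("c", "Z")},
)
```

In the example every subcomponent depends on the selector for removal down, so nothing moves down; X and then Y are removed up, while Z stays in the loop because of its self-dependency.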

Procedure 3. 

Inputs: a hammock $H$ and the set $\Pi(H)$. 

Outputs: a scheme $H'$ and the set $\Pi(H')$. 

1. Construct the set $\Phi$ of all the candidate chains for removal up from $H$ and 
the set $\Psi$ of all the candidate chains for removal down from $H$. Construct the 
graph $A(H)$ and set $F = G = \varepsilon$. Then go to step 2. 

2. If there exists at least one chain $X \in \Psi$ such that $deg_d^+(X) = 0$, select 
this chain, set $\Psi = \Psi \setminus \{X\}$, and go to step 3. Otherwise, go to step 6. 

3. Delete from $A(H)$ all arcs that begin from $X$. Set $G = Comm(X) \circ G$. If 
a dual chain for $X$ exists, then go to step 4. Otherwise, go to step 5. 

4. $\Omega = \Omega \cup \{(T, X)\}$. Go to step 2. 

5. For each element $X_i$ that is a pseudoscheme, delete from $H$ all the schemes 
that belong to the set d-dels($X_i$). Then, delete all elements of $X$ from $H$. 
Return to step 2. 
6. If there exists at least one chain $X \in \Phi$ such that $deg_u^+(X) = 0$, select 
this chain, set $\Phi = \Phi \setminus \{X\}$, and go to step 7. Otherwise, go to 
step 10. 

7. Delete from $A(H)$ all the arcs that begin from $X$. If $X$ has no dual chain 
or if there is no pair of the $(T, X)$ form in the set $\Omega$, then go to step 8. 
Otherwise, go to step 9. 

8. For each element $X_i$ that is a pseudoscheme, delete from $H$ all the schemes 
that belong to the set u-dels($X_i$). Then, delete all elements of $X$ from $H$. Set 
$F = F \circ Comm(X)$ and return to step 6. 

9. Let $(T, X) \in \Omega$. From $T$, construct a pseudoscheme for which 
$par(T) = H$, f-dels(T) = f-dels($X_i$), and b-dels(T) = $\bigcup_i$ b-dels($X_i$) (for 
the sake of convenience, we set f-dels($X_i$) = b-dels($X_i$) = $\{X_i\}$ if scheme 
$X_i$ is not a pseudoscheme). Add $T$ to the set $\Pi$ and delete from $H$ all the 
elements of $X$ and of its dual chain that are pseudoschemes. Go to step 6. 

10. Recalculate $A(H)$, $R(H)$, and $R'(H)$. In the process, we should take 
into account only those deletions that were really performed (ordinary schemes 
are not deleted when pseudoschemes are constructed). Then, return the scheme 
$H' = F \circ H \circ G$ and the set $\Pi(H') = \Pi$. 

Theorem. The scheme that is obtained from the input scheme as a result 
of applying the algorithm described above can be obtained by applying 
transformations of removal from subloops and subhammocks. 

The number of the nodes of a regular scheme $S$ will be called its size and 
denoted by $|S|$. The depth $d(S)$ of a scheme $S$ is the maximum length $n$ of the 
sequences of the components $T_1, \ldots, T_n$ such that $T_i$ is a proper 
subcomponent of $T_{i+1}$ for all $i = 1, \ldots, n - 1$. If there are no such 
sequences, we set $d(S) = 1$. 

Theorem. The algorithm described above requires the time 
$O(d(S)\,|S|^2\,time(m))$ and the storage $O(|S|^2 + space(m))$ to work with a scheme 
$S$. Here $m$ is the number of variables, $time(m)$ is the upper bound of the time 
required for one operation ($\cap$, $\cup$, or $\setminus$) over subsets of the set of 
variables, and $space(m)$ is the upper bound of the memory needed to store one subset 
of the set of variables. 

Theorem. Let S be an irredundant scheme without degenerate subloops, 
and let S' be a scheme obtained from S by applying the algorithm described 
above. Let S" be any scheme obtained from S by applying transformations of 
removal from subloops and subhammocks. Then, the following statements are 
true. 

1. Let $l'$ be the number of transformations that must be applied to obtain $S'$ 
from $S$, and let $l''$ be the number of transformations that must be applied to 
obtain $S''$ from $S$. Then, $l' \geq l''$. 

2. $|S'| \leq |S''|$. 

References 

[1] Pottosin, I.V., Justification of Algorithms for Optimization of Programs, 
Programmirovanie, 1979, no. 2, pp. 3-13. 

[2] Pottosin, I.V. and Yugrinova, O.V., Justification of Purging Transformations for 
Loops, Programmirovanie, 1980, no. 5, pp. 8-16. 




Polymorphism in OBJ-P 



Martin Plümicke 



Wilhelm-Schickard-Institut, Universität Tübingen, Sand 13 
D-72076 Tübingen, Fax.: +49 7071 610399 
pluemick@informatik.uni-tuebingen.de 



Abstract. In this paper we present the functional programming lan- 
guage OBJ-P. OBJ-P is a polymorphic extension of OBJ-3. The main 
features are overloaded function symbols, set inclusion subtyping, and 
parametric polymorphic types. 



Introduction 

The functional programming language OBJ-3 (e.g. [GWM+93]) has two main 
features: overloading of function symbols and subtyping in the sense of set 
inclusion. There is also a powerful module system for OBJ-3. However, OBJ-3 lacks 
parameterized types. It is of course possible to parameterize whole modules, 
but this is not a good solution, because if one type within a module has a 
parameter the whole module must be parameterized. 

We extend OBJ-3 by parametric polymorphic types, which are well-known from 
SML [Mil97]. We call this extension OBJ-P. This means that OBJ-P allows 
both, parameterized types and parameterized modules. The combination leads 
to a powerful language which has enormous possibilities to reuse code and to 
overload function symbols; however, function evaluation remains unambiguous. 
Our overloading feature is more expressive than overloading in Haskell [PH+97]. 
These features (overloading and set inclusion subtyping) are very interesting in 
computer algebra, as in mathematics we deal with overloaded function names 
and with set hierarchies. 

The semantic base of OBJ-3 is the theory of order-sorted algebras (e.g. [GM89]). 
Therefore, we generalize this theory to polymorphic order-sorted algebras. For 
our theory we extend the theory of Smolka [Smo88]. 

1 The Functional Programming Language OBJ-P 

The types of OBJ-P programs are sorted type terms $T_\Theta(TV)$ over a finite rank 
alphabet $\Theta$ of type constructors and a set of type variables $TV$. 

Definition 1. (Type term ordering) Let Te{TV ) be a set of type terms. 
Then < is a type term ordering if < is finite, from 6 <0' follows TVar( 6 ) = 
TVar( <!' ) (where TVar denotes the occurring type variables), and if 0<6' then 
6 is no type variable and 6' has the form !?'( oi, . . . , Un ), where G and 
Oi, . . . G TV . The ordering of the type terms in an OBJ-P program is defined 
as the closure <* of a declared type term ordering < wrt. substitutions. 
The function symbols of a program form a polymorphic order-sorted signature. 

Definition 2. (Polymorphic order-sorted signature) A polymorphic order-sorted 
signature $\Sigma_{OS}$ is a triple $(T_\Theta(TV), <, F)$ where $T_\Theta(TV)$ is a 
set of type terms, $F$ is a $(T_\Theta(TV)^* \times T_\Theta(TV))$-sorted family of 
function symbols, and $<$ is a type term ordering such that the function symbols 
satisfy the monotonicity 
condition: f E and (j{0i)<* for all 1 implies 

[GM89] gives a regularity condition for order-sorted signatures. This condition 
guarantees that each term over a regular order-sorted signature has a least type. 
We generalized this regularity condition for polymorphic order-sorted signatures 
such that each term has a least principal type [Plü99]. 

The semantics of OBJ-P programs is defined by polymorphic order-sorted algebras 
[Plü99], which are declared in OBJ-P programs by recursive equations over 
the signature. 

Example 1. The OBJ-P program MATRIX declares a type term ordering, a poly- 
morphic order-sorted signature, and a polymorphic order-sorted algebra. 



obj MATRIX is 

*** Type term ordering declaration 
sorts Int nsVector(a) Vector(a) Matrix(a) . 
subsort nsVector(a) < Vector(a) . 
subsort Vector(nsVector(a)) < Matrix(a) . 

*** Signature declaration 
op scalar : a -> Vector(a) . 
op vec : a Vector(a) -> nsVector(a) . 
op + : Int Int -> Int . 
op + : Vector(Int) Vector(Int) -> Vector(Int) . 
op + : Matrix(Int) Matrix(Int) -> Matrix(Int) . 

*** Equation declaration 
vars s1 s2 : Int . 
vars v1 v2 : Vector(Int) . 
eq +(scalar(s1), scalar(s2)) = scalar(+(s1, s2)) . 
eq +(vec(s1, v1), vec(s2, v2)) = vec(+(s1, s2), +(v1, v2)) . 
vars vs1 vs2 : nsVector(Int) . 
vars vv1 vv2 : Vector(nsVector(Int)) . 
eq +(scalar(vs1), scalar(vs2)) = scalar(+(vs1, vs2)) . 
eq +(vec(vs1, vv1), vec(vs2, vv2)) = vec(+(vs1, vs2), +(vv1, vv2)) . 
endo 

The basic part of the declared infinite type term ordering is presented by the 
following Hasse diagram: 

[Figure: Hasse diagram over the type terms Vector(x), Matrix(x), Vector(Vector(x)), 
Vector(nsVector(x)), nsVector(Vector(x)), nsVector(nsVector(x)), and nsVector(x).] 
2 The Module System of OBJ-P 

We present the module system by means of an example, which shows the 
possibilities of overloading and set inclusion subtyping in connection with 
module hierarchies. 

The module hierarchy presented in this example describes the sum of polynomials 
over rings. The parameters of this module are the ring over which the polynomials 
are defined and the sum over the ring elements, respectively. The corresponding 
product function is presented in [Plü99]. 

The function symbol + is overloaded with the sum over the ring elements, the 
monomials, and the polynomials over the ring, respectively. There is a predefined 
module Int, which exports the usual functions on numbers. 

The Recursive Representation of Polynomials In the SACLIB [BCE+92] 
computer algebra library, polynomials are described in the recursive representa- 
tion. 

The OBJ-P sorts declaration in the OBJ-P module Polynom represents recursive 
polynomials over any ring. 

module Polynom (nmPolynom(Ring), Monom(Ring), 
mono: Ring nzCard -> Monom(Ring),                *** exports 
poly: Monom(Ring) Polynom -> nmPolynom(Ring)) 
[sorts Ring, Polynom] is                         *** parameters 
import Int . 
sorts Monom(a) nmPolynom(a) . 
subsorts Ring Monom(Ring) nmPolynom(Ring) < Polynom . 
op mono: Ring nzCard -> Monom(Ring) . 
op poly: Monom(Ring) Polynom -> nmPolynom(Ring) . 
endm 

In the module two type constructors Monom and nmPolynom are declared. The type 
Monom(Ring) stands for the set of monomials with exponents greater than 0 and 
nmPolynom(Ring) stands for the general polynomials (the non-monomial 
polynomials). Both types are parameterized by the module parameter Ring, which 
stands for the type of the ring elements. The other module parameter is Polynom. 
It stands for the type of the union of all ring elements (Ring), all monomials 
(Monom(Ring)), and all non-monomial polynomials (nmPolynom(Ring)). The 
exported sorts are nmPolynom(Ring) and Monom(Ring), while the exported 
constructors (function symbols) are mono and poly. 

A polynomial in this representation consists of a list of monomials with the list 
constructor poly. In this representation the names of the variables are unknown. 
Now we give an example. Let us consider a polynomial in two variables: 

$x_2^3 + (3x_1^3 + 2x_1 + 1)\,x_2^2 + (x_1^2 + x_1 + 1)\,x_2 + (x_1^2 + x_1 + 12)$. 

For polynomials in two variables we must import the module Polynom twice: 

import Polynom [Ring = Int, Polynom = Polynom1] . 
import Polynom [Ring = Polynom1, Polynom = Polynom2] . 

where Polynom1 (stands for $\mathbb{Z}[x_1]$) and Polynom2 (stands for 
$\mathbb{Z}[x_1][x_2]$) are new sorts, which are instantiated in the imported modules. 
Then, the above polynomial is represented by: 

poly(mono(mono(1, 0), 3), 
poly(mono(poly(mono(3, 3), poly(mono(2, 1), 1)), 2), 
poly(mono(poly(mono(1, 2), poly(mono(1, 1), 1)), 1), 
poly(mono(1, 2), poly(mono(1, 1), 12))))) 

It is an element of the ring $\mathbb{Z}[x_1][x_2]$. Therefore, the type is Polynom2. 

If we have a closer look at the representation (poly(mono(1, 2), poly(mono(1, 
1), 12))) of the coefficient $(x_1^2 + x_1 + 12)$ (last line) we notice that its type 
is Polynom1 instead of Polynom2. This is possible as the type Polynom1 is a subtype 
of Polynom2, induced by the subsort declaration Ring < Polynom in module Polynom 
through the second import. On the other hand, the type of the monomial $x_2^3$ 
(represented by mono(mono(1, 0), 3)) is Monom(Monom(Int)) and not Monom(Int), 
as $\mathbb{Z}[x_2]$ is not a subset of $\mathbb{Z}[x_1][x_2]$. 

The main difference to the usual recursive representation of polynomials is the 
following: A polynomial $p \in \ldots[x_{n-1}]$ is usually represented in 
$\ldots[x_n]$ as $p \cdot x_n^0$ and not as $p$ like in our representation. This is 
only possible because OBJ-P allows set inclusion subtyping and multiple imports of 
one module with different parameter instantiations. 
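
To make the recursive representation concrete, the following Python sketch mirrors the constructors mono and poly and evaluates the example polynomial. It is an illustration only, not OBJ-P: Python has no set inclusion subtyping, so the sketch assumes that the variable of a monomial can be recovered from its nesting depth (a monomial whose coefficient is an integer belongs to $x_1$, one whose coefficient lies in $\mathbb{Z}[x_1]$ belongs to $x_2$, and so on), which plays the role of the least type. The names `Mono`, `NmPoly`, `level`, and `evaluate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Mono:
    coef: "Poly"   # element of the underlying ring
    exp: int       # exponent of the variable at this level

@dataclass(frozen=True)
class NmPoly:
    head: Mono     # leading monomial
    rest: "Poly"   # remaining polynomial (may live at a lower level)

Poly = Union[int, Mono, NmPoly]

def level(p: Poly) -> int:
    """Number of the variable a term belongs to; integers have level 0."""
    if isinstance(p, int):
        return 0
    if isinstance(p, Mono):
        return 1 + level(p.coef)
    return max(level(p.head), level(p.rest))

def evaluate(p: Poly, xs: list) -> int:
    """Evaluate p at xs = [x1, x2, ...]."""
    if isinstance(p, int):
        return p
    if isinstance(p, Mono):
        return evaluate(p.coef, xs) * xs[level(p) - 1] ** p.exp
    return evaluate(p.head, xs) + evaluate(p.rest, xs)

# The example polynomial in two variables from the text.
example = NmPoly(Mono(Mono(1, 0), 3),
          NmPoly(Mono(NmPoly(Mono(3, 3), NmPoly(Mono(2, 1), 1)), 2),
          NmPoly(Mono(NmPoly(Mono(1, 2), NmPoly(Mono(1, 1), 1)), 1),
          NmPoly(Mono(1, 2), NmPoly(Mono(1, 1), 12)))))
```

Note that the trailing coefficient `NmPoly(Mono(1, 2), NmPoly(Mono(1, 1), 12))` appears directly as the rest of a two-variable polynomial, without being wrapped as a coefficient of $x_2^0$; this is the point the paragraph above makes about subtyping.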

Sum of Polynomials 

module PolynomSUM (+: Polynom Polynom -> Polynom)       *** exports 
[sorts Ring Polynom, op +: Ring Ring -> Ring] is        *** parameters 
import Int . 
import Polynom [Ring = Ring, Polynom = Polynom] .       *** import where the 
                                                        *** parameters are instantiated 
op +: Polynom Polynom -> Polynom . 

vars coe1 coe2 re : Ring . 
vars exp1 exp2 : nzCard . 
vars p1 p2 : Polynom . 
var m : Monom(Ring) . 

eq +(mono(coe1, exp1), re) = poly(mono(coe1, exp1), re) . 
eq +(poly(m, p1), re) = poly(m, +(p1, re)) . 
eq +(mono(coe1, exp1), mono(coe2, exp2)) = ... 
eq +(poly(mono(coe1, exp1), p1), mono(coe2, exp2)) = 
if (exp1 > exp2) then poly(mono(coe1, exp1), +(p1, mono(coe2, exp2))) 
else 
if (exp2 > exp1) then poly(mono(coe2, exp2), poly(mono(coe1, exp1), p1)) 
else poly(mono(+(coe1, coe2), exp1), p1) fi fi . 

eq +(poly(mono(coe1, exp1), p1), poly(mono(coe2, exp2), p2)) = ... 
eq +(p1, p2) = +(p2, p1) . 
endm 
In the module PolynomSUM there is a function call named by the overloaded 
function symbol + (underlined). This function call produces either a recursive call 
(depending on the argument type) or a call of the module parameter function +. 
This is an example of the overloading feature in OBJ-P. 

The module PolynomSUM is a parameterized module similar to the module Polynom. 
Now we present different ways to import these two modules. 

Polynomials over the Integer Numbers 

module Int2PolynomSUM is 
sorts Polynom1 Polynom2 . 
import Int . 
import PolynomSUM [Ring = Int, Polynom = Polynom1, 
                   + = +: Int Int -> Int] . 
import PolynomSUM [Ring = Polynom1, Polynom = Polynom2, 
                   + = +: Polynom1 Polynom1 -> Polynom1] . 
op Main: Polynom2 Polynom2 -> Polynom2 . 
vars x y : Polynom2 . 
eq Main(x, y) = +(x, y) . 
endm 

The module PolynomSUM is imported twice. While in the first import the parameters 
of PolynomSUM are instantiated by Int, Polynom1, and +: Int Int -> Int, in 
the second import the parameters Ring and +: Ring Ring -> Ring are instantiated 
by the already imported types Polynom1 and +: Polynom1 Polynom1 -> Polynom1, 
and Polynom is instantiated by the additionally declared type Polynom2. Finally, 
the function Main defines the polynomial sum over the ring $\mathbb{Z}[x_1][x_2]$. 
We notice that the type Polynom1 is the union of Int, Monom(Int), and 
nmPolynom(Int), while Polynom2 is the union of Polynom1, Monom(Polynom1), and 
nmPolynom(Polynom1). From this it follows that the Main function is heavily 
overloaded. Main is applicable to integer numbers, to polynomials in the variable 
$x_1$, to polynomials in the variables $x_1$ and $x_2$, and to any mixture 
of these types. This is a very natural way to overload the sum function. In 
languages like Haskell [PH+97] this is impossible, as we have shown in [Plü99]. 

Polynomials over Z/n The ring over which the polynomials are defined is now 
Z/n. We assume that there is a module Zmodn, which is parameterized by n and 
in which the sum function of Z/n is exported. 

module Zmod43PolynomSUM is 
import Int . 
import Zmodn [n = 4] as Zmod4 .      *** the identifiers id are used 
                                     *** qualified as Zmod4.id 
sorts Polynom1, Polynom2, Polynom3 . 
import PolynomSUM [Ring = Card, Polynom = Polynom1, + = Zmod4.+: ...] . 
import PolynomSUM [Ring = Polynom1, Polynom = Polynom2, + = +: ...] . 
import PolynomSUM [Ring = Polynom2, Polynom = Polynom3, + = +: ...] . 
op Main: Polynom3 Polynom3 -> Polynom3 . 
vars x y : Polynom3 . 
eq Main(x, y) = +(x, y) . 
endm 

The Main function defines the sum of polynomials over Z/4$[x_1][x_2][x_3]$. 
This example shows the possibilities for code reuse in OBJ-P. It is possible to 
supply a new sum function (in this example from module Zmodn) and assign it to 
the function parameter + in the module PolynomSUM, while the code of PolynomSUM 
remains unchanged. 

Summary The subtyping feature of OBJ-P allows polynomials $p \in \ldots[x_m]$ 
to be represented in the supertype $\ldots[x_n]$ ($m < n$) identically as in 
$\ldots[x_m]$. This is not possible in other programming languages. 

Furthermore, the subtyping feature enables the sum function to have only two 
arguments, instead of three (the third for the number of variables) as would 
usually be expected (cf. SACLIB [BCE+92]). 

Additionally, because of the overloading feature of OBJ-P, there is the same 
function symbol for the sum function over ring elements, monomials, and poly- 
nomials, although these sets of elements are represented by different types. 



3 Conclusion and Further Work 

We have presented the programming language OBJ-P, which has the special 
features of overloaded function symbols, set inclusion subtyping, and parametric 
polymorphic types. 

Additionally, in [Plü99] we have defined a type inference system and a 
corresponding type reconstruction algorithm, which allows us to omit the declarations 
of the function symbols and the variable declarations in OBJ-P programs. 
Abstract. We report results of a joint project with France Telecom on 
the modelling of telephone services (features) using formal methodologies 
such as OO ACT ONE, B and TLA+. We show how we formalise the 
feature interaction problem in a multi-view model, and we examine issues 
such as animation, validation, proof and verification. 



1 Introduction 

In this section we briefly introduce the need for formal methods in software 
engineering, the use of formal methods to help resolve the feature interaction 
problem, and the particular formal methods we adopt in our mixed-semantic 
model. 



1.1 Formality 

Many software engineers do not acknowledge the value of formality. In 1993, a 
major study [13] concluded by stating: “ . . . formal methods, while still immature 
in certain important respects, are beginning to be used seriously and successfully 
by industry to design and develop computer systems ...” We believe that for- 
mal methods are, five years later, just about ready for transfer to the industrial 
development of telephone features. As in all forms of engineering, one must
always compromise between quality and cost. In telephone systems, it appears 
that the cost of resolving interactions between features at the implementation 
stage is now (or will soon be) greater than the cost of developing formal feature 
requirements models and eliminating many of the potential interactions before 
implementation begins. Formal methods in this domain should be regarded as 
an investment for the future. 

There is a wide and varied range of definitions of “formal method”, which can be 
found in the majority of texts concerned with mathematical rigour in computer 
science. The most common methods used for telephone feature specification are 
reviewed in [42]. For the purposes of this paper we propose the following
definition: A formal method is any technique concerned with the construction and/or 
analysis of mathematical models which aid the development of computer systems. 
Formal methods are fundamentally concerned with correctness: the property that 
an abstract model fulfils a set of well defined requirements. In this paper, we are 
concerned with the construction of such requirements models. 

A formal model of requirements is unambiguous — there is only one correct 
way to interpret the behaviour being defined. Although the model must still be 
mapped onto the real world (i.e. validated by the customer), this mapping is in 
essence more rigorous than in informal approaches. Building a formal model re- 
quires a better understanding of the problem domain and a better understanding 
of how the problem domain is viewed by the customer. 

A major problem when using formal methods in software engineering is that 
much of the recent research places emphasis on analysis rather than synthesis. 
The means of constructing complex formal models is often overlooked in favour 
of techniques for analysing models. 

Re-usable analysis techniques will automatically arise out of re-usable composi- 
tion mechanisms. Formal method engineers need to learn techniques for building 
very large, complex systems. Such techniques have been followed, with various 
degrees of success, by programmers. In particular, object oriented programmers 
have evolved techniques which have been successfully transferred to the analysis 
and design phases of software engineering. Where better then to look for aid in 
the construction of large formal models? 



1.2 Feature Interactions 

A feature interaction is a situation in which system behaviour is specified as 
a composition of some set of features: each individual feature can meet its re- 
quirements in isolation but all features cannot meet their requirements when 
composed. 
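
As a minimal illustration of this definition, the following Python sketch models two features as wrappers around a basic routing function. Everything in it is a hypothetical assumption made for illustration (the feature names, the users a, b, c and the requirement predicates); none of it is taken from the models of this paper. Each feature meets its requirement when composed with the basic service alone, but no composition order meets both requirements.

```python
# Hypothetical toy model: routing returns the party finally connected,
# or None if the call is blocked.

def pots(caller, dialled):
    """Basic service: connect the caller to the dialled number."""
    return dialled

def with_screening(route, blocked):
    """Screening feature: calls to a blocked number are refused."""
    def routed(caller, dialled):
        if dialled in blocked:
            return None
        return route(caller, dialled)
    return routed

def with_forwarding(route, forward):
    """Forwarding feature: calls to a key of `forward` are redirected."""
    def routed(caller, dialled):
        return route(caller, forward.get(dialled, dialled))
    return routed

USERS = ["a", "b", "c"]

def screening_ok(route, blocked):
    """Requirement of screening: nobody is ever connected to a blocked party."""
    return all(route(x, y) not in blocked for x in USERS for y in USERS)

def forwarding_ok(route, forward):
    """Requirement of forwarding: a call to a forwarded number reaches its target."""
    return all(route(x, src) == dst for x in USERS for src, dst in forward.items())

blocked = {"c"}
forward = {"b": "c"}

screening_alone = with_screening(pots, blocked)
forwarding_alone = with_forwarding(pots, forward)

# Two possible composition orders of the same two features.
screen_then_forward = with_screening(with_forwarding(pots, forward), blocked)
forward_then_screen = with_forwarding(with_screening(pots, blocked), forward)

for name, route in [("screen_then_forward", screen_then_forward),
                    ("forward_then_screen", forward_then_screen)]:
    print(name, screening_ok(route, blocked), forwarding_ok(route, forward))
```

In this toy setting the contradiction is visible in the requirements themselves: forwarding demands that a call to b reaches c, while screening demands that nobody reaches c.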

The problem of feature interaction is a major topic in telecommunications where 
formal methods have been usefully applied. There is no single technique which 
addresses all the aspects of the problem, but the most common approaches that 
have been used to tackle the problem, at the requirements stage, are: SDL [29, 
30], LOTOS [18, 7, 17], state machine and rule based representation [19], and 
temporal logic [4, 3, 11]. 



1.3 Our Formal Models 

In our formal approach, interactions occur only when requirements of multi- 
ple features are contradictory. The complexity of understanding the problem is 
thus contained within a definition of contradiction in our semantic framework. 
We have argued that, for most of the feature interaction examples found in
published texts, there is no generally accepted standard formal definition of feature 
interaction [25, 43, 6, 9, 15]. In fact, most of the interactions which we studied 
correspond to incomplete and informal requirements models. In other words, if 
the features were modelled better, then we would be better able to understand 
what is and what isn’t an interaction. 

LOTOS (Language Of Temporal Ordering Specifications), see [40, 28], is a wide 
spectrum language, which is suitable for specifying systems at various levels of 
abstraction. Consequently, it can be used at both ends of the software develop- 
ment spectrum. Its natural division into ADT part (based on ACT ONE [16]) 
and process algebra part (similar to CSP [26] and CCS [37]) is advantageous 
since it provides the flexibility of two different semantic models for expressing 
behaviour, whilst managing to integrate them in a relatively coherent fashion. 
LOTOS provides an elegant way to specify services and to detect interaction 
among services; it allows the user to specify services in a compositional manner 
and it provides a set of tools such as LITE from the project LOTOSPHERE, 
to assist in service engineering. Questions regarding fairness cannot be easily 
expressed or solved in LOTOS: modeling fairness requires us to state properties 
on traces, or a scheduling policy, and LOTOS has not yet integrated fairness 
constraints. 

We have used LOTOS in our project and compared different languages, such as
B, TLA+ and OO ACT ONE LOTOS, with respect to their expressivity and to the
availability of practical development environments for B and LOTOS. The style
of specification plays a very important role. The approach of Gammelgaard [19]
is automaton-oriented: it uses a specification language based on transition systems 
as predicates. The weakness of their solution lies in its partial view of details, 
whereas a sound and semantically complete reasoning system is required. The 
solution using TLA [31, 24] borrows the initial idea from their model, but TLA 
has the advantage of a very carefully equipped proof system. Finally, as the 
temporal framework can be very expressive, we need a computer-aided proof 
environment and more generally applicable software environments based on these 
formalisms. 

Blom et al. [3] and Middelburg [36] investigate the use of temporal logic for 
specifying services; Blom uses a temporal logic integrating the reactive and the 
frame parts for services. Middelburg introduces a temporal logic of branching 
time and restricts its expressivity to obtain a TLA-like logic. 

In fact, the integration of very different formalisms such as TLA, B and LOTOS 
is a way to improve service engineering. B is simple and a tool helps the user in 
developing specifications: we do not claim that B will solve the entire problem but 
it is very helpful in the building of requirements models for telecommunication 
services. Although we emphasize B as a tool for developing service specifications
with a theorem prover, another crucial element of B is its animator. Several
problems are detected by animation and so do not need to be resolved by the prover.
We have experimented with B as a tool for service engineering, although it was not 
one of the original goals of the language. Another point is that B and TLA are 
very close, at least for the action part; we have studied the integration of B and 
TLA [35] to re-use the B tools for TLA and to extend the scope of B through 
temporal features. 

Our paper is organized as follows. Section 2 describes our mixed model involv- 
ing different aspects of the formal development. Section 3 introduces service 
requirements. Section 4 gives details on the way we model services in TLA+; we 
explain how our mixed views can be checked to be coherent. Section 5 concludes 
our paper. 

2 A Mixed Semantic Model 

We have shown the need for a mixed semantic model when specifying telephone 
feature requirements [22]. Such a model is used to provide three different client 
views: 

— An object oriented view which provides the operational semantics used during 
animation for validation, and the structuring mechanisms which are funda- 
mental to our approach. This view is formalised using an object oriented 
style of specification in LOTOS [20]. 

— An invariant view which allows the client to describe abstract properties of 
a system (or component) which must always be true. This view is formalised 
using B and leads to the automatic detection of many interactions [33, 34]. 

— A fairness view which allows the client to describe properties of the system 
which must eventually be true even though they have no direct control over 
them. A temporal logic provides an ideal means of specifying and verifying 
such requirements [23]. 



2.1 Objects and Classes 

Labelled state transition systems are often used to provide executable models 
during the analysis and requirements stages of software development [12, 14]. In 
particular, such models play a role in many of the object oriented analysis and 
design methods [5, 10]. However, a major problem with state models is that it 
can be difficult to provide a good decomposition of large, complex systems when 
the underlying state and state transitions are not fully understood. The object 
oriented paradigm provides a natural solution to this problem. By equating the 
notion of class with the state transition system model, and allowing the state of 
one class to be defined as a composition of states of other classes, we provide a 
means of specifying state transition models in a constructive fashion. Further, 
such an approach provides a more constructive means of testing actual behaviour 
against required behaviour. 

This state based view forms the basis on which we build our feature animations 
and permits behaviour validation in a compositional manner. However, such 
operational models are not good for formal reasoning about feature requirements 
[44]: for this we need to consider specification of state invariants and fairness 
properties. 
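
The idea of equating a class with a labelled state transition system, and of building the state of one class from the states of other classes, can be sketched as follows. This is a hedged Python illustration only: the models in this paper are written in an object oriented style of LOTOS, and the class names `Phone` and `Line`, their states and their service labels are hypothetical.

```python
from dataclasses import dataclass, field

# Transition table of the hypothetical Phone class:
# (current state, service label) -> next state.
PHONE_TRANSITIONS = {
    ("onhook", "lift"): "offhook",
    ("offhook", "drop"): "onhook",
}

@dataclass(frozen=True)
class Phone:
    """A class viewed as a labelled state transition system."""
    state: str = "onhook"

    def accept(self, label):
        """Return the successor object, or raise if the service is not offered."""
        key = (self.state, label)
        if key not in PHONE_TRANSITIONS:
            raise ValueError(f"{label!r} is not offered in state {self.state!r}")
        return Phone(PHONE_TRANSITIONS[key])

@dataclass(frozen=True)
class Line:
    """A structured class: its state is a composition of two Phone states."""
    caller: Phone = field(default_factory=Phone)
    callee: Phone = field(default_factory=Phone)

    def accept(self, who, label):
        """Delegate a service request to one component; the other is unchanged."""
        if who == "caller":
            return Line(self.caller.accept(label), self.callee)
        if who == "callee":
            return Line(self.caller, self.callee.accept(label))
        raise ValueError(f"unknown component {who!r}")

line = Line().accept("caller", "lift").accept("callee", "lift")
print(line)
```

Because the state of `Line` is built from the states of its components, the behaviour of each component can be validated on its own before the composite is animated.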

2.2 Invariants 

Invariants are used to specify properties of a system which must always be true 
for reachable states. Within the object oriented framework we have three kinds 
of invariant: 

— Typing: By stating that all objects are defined to be members of some 
class we are in fact specifying an invariant. These invariants are verified 
automatically by our object oriented tools. 

— Service requests: Typing also permits us to state that objects in our system 
will only ever be asked to perform services that are part of their interfaces. 
These invariants are also verified automatically by the object oriented tools. 

— State Component Dependencies: In a structured class we may wish 
to specify some property that depends on the state of two or more of the 
components, and which is invariant. This cannot be statically verified using 
the object oriented tools, but it can be treated through a dynamic analysis 
(model check). Unfortunately, such a model check cannot be guaranteed 
when we have a large (possibly infinite) number of states in our systems. For 
this reason we need to utilise a less operational framework. By translating 
our state invariant requirements into B, we have been able to statically verify 
our state component invariants. 
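
The dynamic analysis mentioned for state component dependencies can be sketched as an explicit-state search: enumerate the reachable states and test the invariant in each. The Python below is a generic breadth-first check under assumed, hypothetical names (`check_invariant` and a two-phone toy system); it is not the tool used in this work. It terminates only because the toy state space is finite, which is exactly the limitation that motivates the translation into B.

```python
from collections import deque

def check_invariant(initial, successors, invariant):
    """Breadth-first search over the reachable states.

    Returns None if the invariant holds in every reachable state,
    otherwise a reachable state that violates it.
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

def toggle(hook):
    return "offhook" if hook == "onhook" else "onhook"

# Toy structured state: (phone_a, phone_b, connected).
def successors(state):
    a, b, connected = state
    yield (toggle(a), b, False)       # a changes hook state; any connection is released
    yield (a, toggle(b), False)       # likewise for b
    if a == "offhook" and b == "offhook":
        yield (a, b, True)            # connection established

def buggy_successors(state):
    a, b, connected = state
    yield (toggle(a), b, connected)   # bug: hanging up does not release the connection
    yield (a, toggle(b), connected)
    if a == "offhook" and b == "offhook":
        yield (a, b, True)

# State component dependency: a connection requires both phones to be off hook.
def invariant(state):
    a, b, connected = state
    return (not connected) or (a == "offhook" and b == "offhook")

initial = ("onhook", "onhook", False)
ok_result = check_invariant(initial, successors, invariant)
bad_state = check_invariant(initial, buggy_successors, invariant)
print(ok_result, bad_state)
```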



2.3 Nondeterminism and Fairness 

TLA is a temporal logic introduced by Lamport [31] and based on the action- 
as-relation principle. A system is considered as a set of actions, namely a logical 
disjunction of predicates relating values of variables before the activation of 
an action and values of variables after the activation of an action; a system is 
modeled as a set of traces over a set of states. The specifier may decide to ignore 
traces that do not satisfy a scheduling policy such as strong or weak fairness, 
and temporal operators such as □ (Always) or ◇ (Eventually) are combined 
to express these assumptions over the set of traces. Such fairness is important 
in feature specification and cannot be easily expressed using our state based 
semantics. The key is the need for nondeterminism in our requirements models. 
Without a temporal logic, nondeterminism in the features can be specified only 
at one level of abstraction: namely that of an internal choice of events. This 
can lead to many problems in development. For example, consider the specifica- 
tion of a shared database. This database must handle multiple, parallel requests 
from clients. The order in which these requests are processed is required to be 
nondeterministic. This is easily specified in our object model. However, the
requirements are now refined to state that every request must be eventually served 
(this is a fairness requirement which we cannot directly express in our semantic 
framework). The only way this can be done is to over-specify the requirement by 
defining how this fairness is to be achieved (for example, by explicitly queueing 
the requests). This is bad because we are enforcing implementation decisions at 
the requirements level. With TLA we can express fairness requirements without 
having to say how these requirements are to be met. 
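
The shared database example can be made concrete with a small sketch. An infinite behaviour that eventually repeats can be represented as a "lasso": a finite prefix followed by a cycle that is repeated forever. The Python below is an illustrative sketch under stated assumptions, not TLA and not the tooling of this paper: the client names, the lasso encoding and the function `every_request_served` are all hypothetical, and each client is assumed to have a request pending at all times.

```python
def every_request_served(clients, prefix, cycle):
    """Check the fairness requirement on the behaviour prefix . cycle^omega.

    Each element of `prefix` and `cycle` names the client served at that step.
    Assumption: every client always has a pending request, so "every request
    is eventually served" holds iff each client is served infinitely often,
    i.e. appears somewhere in the repeated cycle. The finite prefix cannot
    affect this property and is accepted only to describe the whole behaviour.
    """
    if not cycle:
        raise ValueError("the cycle of a lasso must be non-empty")
    return all(client in cycle for client in clients)

clients = ["c1", "c2"]

# Both schedules are allowed by a purely nondeterministic choice of the next
# request to process; only the first one meets the fairness requirement.
fair = every_request_served(clients, prefix=["c1", "c1"], cycle=["c2", "c1"])
unfair = every_request_served(clients, prefix=["c2"], cycle=["c1"])
print(fair, unfair)
```

The point of the sketch is that the requirement is stated on behaviours, without fixing a queue or any other mechanism that would achieve it.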

2.4 Composition Mechanisms 

Composition is primarily a question of re-use: given two already specified com- 
ponents, how can we create a new component from those given? A composition 
mechanism defines a creation mechanism which is reusable (i.e. can be applied 
to different sets of components). Clearly, we have to be more precise as to the 
meaning of a component. From the customer’s point of view, and hence at the 
requirements level of abstraction, a component must be some piece of behaviour 
which can be validated independently. In other words, a component must be able 
to be seen as a model of behaviour in its own right. We give an overview of the 
composition techniques from each of our three different view points and argue 
that a user oriented view would be best during requirements capture: 

(1) Object oriented composition in LOTOS: 

LOTOS [8] is made up from an abstract data type part [32], and a process 
algebra part [26]. Clearly there are ways of composing behaviours in each of 
these models. However, the object oriented composition is at a higher level of 
abstraction. We do not compose with language operators; rather we compose 
using object oriented concepts. 

(2) Invariant composition (in B): 

B [1] is a model-oriented method providing a complete development process from 
abstract specification towards implementations through step-by-step refinement 
of abstract machines. An abstract machine describes data, operations and an
invariant preserved by every operation. Abstract machines are composed by
conjunction of their invariants and combination of their operations. The resulting
abstract machine may either preserve the resulting invariant, or invalidate it. The
violation of the invariant is interpreted as an interaction [34] and is in fact an 
interference between operations: it is a way to detect interaction among services 
specified as abstract machines. The main advantage of B is that it is supported 
by a powerful software environment, namely the Atelier B [39]. The B method [1] 
is itself a conceptual tool for specifying, refining and developing systems in a 
mathematical and rigorous, but simple way. 
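
The detection of interactions as invariant violations can be illustrated by brute force on a small finite state space. The sketch below is Python rather than B, and every name in it (`Machine`, `compose`, `violations`, and the toy screening and forwarding machines) is a hypothetical stand-in. A B environment would discharge the corresponding proof obligations with a prover instead of enumerating states.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    invariant: object    # predicate on a state (a dict of variables)
    operations: dict     # operation name -> function from a state to a new state

def compose(m1, m2):
    """Conjoin the invariants and combine the operations of two machines."""
    return Machine(
        invariant=lambda s: m1.invariant(s) and m2.invariant(s),
        operations={**m1.operations, **m2.operations},
    )

def violations(machine, states):
    """Names of operations that can break the invariant from a state satisfying it."""
    bad = set()
    for s in states:
        if machine.invariant(s):
            for name, op in machine.operations.items():
                if not machine.invariant(op(dict(s))):   # op works on a copy
                    bad.add(name)
    return bad

def call_b(s):
    s["connected_to"] = "b"
    return s

def forward_to_c(s):
    if s["connected_to"] == "b":
        s["connected_to"] = "c"
    return s

# Screening: the user must never be connected to c.
screening = Machine(lambda s: s["connected_to"] != "c", {"call_b": call_b})
# Forwarding: no invariant of its own in this toy model.
forwarding = Machine(lambda s: True, {"forward_to_c": forward_to_c})

states = [{"connected_to": v} for v in (None, "b", "c")]
alone = (violations(screening, states), violations(forwarding, states))
together = violations(compose(screening, forwarding), states)
print(alone, together)
```

Each machine preserves its own invariant, while in the composed machine the forwarding operation breaks the conjoined invariant, which is the interference between operations described above.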

(3) Fairness composition (in TLA): 

The composition of fairness assumptions in TLA is done at a high level of ab- 
straction and is preserved through the composition process. A model for a TLA 
formula is an infinite trace of states, and a TLA specification is made up of three 
parts: 

— the initial conditions, $Init$, 

— the relation over variables, $Next(x, x')$, and 

— the fairness constraints, $\bigwedge_{A \in WFA} \mathrm{WF}_x(A) \wedge \bigwedge_{A \in SFA} \mathrm{SF}_x(A)$ 

(we require that $A \Rightarrow Next(x, x')$, for all $A$ in $WFA$ or $SFA$, to ensure the 
machine-closure property). 

Fairness constraints remove models or traces that do not satisfy them. A service 
is characterized by a set of flexible variables, initial conditions, a next relation 
over variables and fairness constraints. When combining two services, we increase 
the restrictions over traces but we extend the models by adding new variables. 
TLA provides an abstract way to state fairness assumptions but in our approach 
this unfriendly syntax is hidden from the customer. We encapsulate fairness 
within each object as a means of resolving nondeterminism due to internal state 
transitions. This is a simple yet powerful way for the fairness to be structured 
and re-used within our requirements models. 

(4) Feature composition (user conceptualisation): 

In an ideal world, feature composition would be done using concepts within the 
clients’ conceptual model of their requirements. Clients cannot be expected to 
express themselves using formal language operators. This does not mean that 
they cannot express themselves formally. It is the role of the analyst to map the 
clients’ composition concepts onto composition methods in the formal model. 
For now, we are forced to communicate through the object oriented models 
(which could be argued to be client friendly). In the future we hope to develop 
a modeling language based on client concepts rather than modeling language 
concepts. 

3 Requirements for Features 

3.1 Requirements Modeling: Customer Orientation 

Requirements capture is the first step in the process of meeting customer needs. 
Building and analysing a model of customer needs, with the intention of passing 
the result of such a process to system designers, is the least well understood 
aspect of software engineering. The process is required to fulfil two very differ- 
ent needs: the customer must be convinced that requirements are completely 
understood and recorded, and the designer must be able to use the requirements 
to produce a structure around which an implementation can be developed and 
tested. In this paper, we concentrate on the customers’ point of view, whilst not- 
ing that the object oriented approach does lend itself to meeting the designers’ 
needs [21]. We advocate such a customer oriented approach since it is generally 
agreed that customer communication is the most important aspect of analysis 
[27, 38, 41]. 

The fundamental principle of requirements capture is the improvement of mutual 
understanding between customer and analyst, and the recording and validation 
of such an understanding in a structured model. The successful synthesis of a 
requirements model is dependent on being able to construct a system as the 
customer views the problem. [2, 25] illustrate this point with respect to feature 
models. 

3.2 Feature Interaction: What’s New? 

We concentrate on the domain of telephone features, where the problem has 
been acknowledged for many years [6, 9]. Figure 1 illustrates the problem within 
the formal framework which we adopt throughout this paper. We note that the 
means by which features are composed is not specified. 

Fig. 1. Feature Interaction: A formalisation 

Features are observable behaviour and are therefore a requirements specifica- 
tion problem [45]. Many feature interaction problems can be resolved through 
communication with the customer during requirements capture. Given a fea- 
ture requirements specification which is not contradictory, interaction problems 
during the design and implementation will arise only through errors in the re- 
finement process. Certainly the feature interaction problem is more prone to 
the introduction of such errors because of the highly concurrent and distributed 
nature of the underlying implementation domain, but this is for consideration 
after each individual feature’s requirements have been modelled and validated. 
We have extended the work given in [25], where the composition of features was 
done in an ad-hoc fashion, by identifying and formalising re-usable composition 
mechanisms. The configuration of multiple features will be shown to depend 
on the way in which individual features are composed with POTS (plain old 
telephone service). 

Features are requirements modules and the units of incrementation as systems 
evolve. A telecom system is a set of features. Having features as the incremental 
units of development is the source of our complexity. An understanding of feature 
composition helps us manage the four main sources of this complexity — 

(1) State explosion: 

Potential feature interactions increase exponentially with the number of features 
in the system and traditional model checking techniques cannot cope with the 
complexity. The fundamental problem is that analysis cannot be done composi- 
tionally. We argue that compositional (re-usable) analysis depends on having a 
formal understanding of the composition mechanisms. This is the main goal of 
this work. 

(2) Chaotic Information Structure In Sequential Development Strategies: 

The arbitrary sequential ordering of feature development is what drives the in- 
ternal structure of the resulting system. As each new feature is added the feature 
must potentially include details of how it is to be configured with all the
features already in the system. Consequently, to understand the behaviour of one 
feature, it is necessary to examine the specification of all the features in the 
system. All conceptual integrity is lost since the distribution of knowledge is po- 
tentially chaotic. At the moment this is certainly true. However, we believe that 
we can control the distribution of this configuration knowledge by containing it 
within a re-usable set of configuration mechanisms. 

(3) Implicit Assumption Problem: 

Already developed features often rely on assumptions which are no longer true 
when later features are conceived. Consequently, features may rely on contradic- 
tory (implicit) assumptions. This is a great source of interactions. We propose 
forcing the specifiers to formalise their (explicit) assumptions, by forcing them 
to use a certain set of configuration mechanisms. 

(4) Independent Development: 

Traditional approaches require a new feature developer to consider how the
feature operates with all others already on the system. Consequently, we cannot 
develop new features concurrently, since how the new features work together 
will not be considered by either of the two independent feature developers. This 
problem is amplified if feature developers can configure features in any way that 
they wish. 

4 Feature Interaction: An Incremental Development View 

In figure 2, we take POTS as one requirement model. We note that to extend 
this base requirement with a new feature we must define a means of composing 
POTS with this feature, or, as illustrated in the diagram, use a previously defined 
mechanism. Unfortunately, for two different features there is no guarantee that 
we can use the same composition mechanism. Furthermore, for each composition 
we may require an additional restriction (called the composition invariant) on 
the way in which the parts are configured in order to guarantee that individual 
requirements are met. 

Given such a composition technique we must now address the problem of in- 
tegrating Feature 1 and Feature 2 in the same set of requirements. In figure 3, 
we see that an interaction occurs if the invariants introduced by the two fea- 
tures and/or the two composition mechanisms are contradictory. Properties are 
required to be preserved through the composition process; the multi-view ap- 
proach allows us to integrate the view of invariants (using B) and the view of 
fairness (using TLA). 

We note that there are many different ways in which we may wish to compose 
the three components. The four most obvious structures are: 

— Compose1(comp1(POTS, feature1), feature2), where we compose 
feature 2 with the component which results from a composition between 
POTS and feature 1. 

— Compose2(comp2(POTS, feature2), feature1), where we compose 
feature 1 with the component which results from a composition between 
POTS and feature 2. 

— Compose3(POTS, comp3(feature1, feature2)), where we first compose the 
two features and then compose this new component with POTS. 

— Compose4(POTS, feature1, feature2), where we define a new
composition mechanism which acts on all three components. 

Fig. 2. Incrementing POTS 

The feature composition problem is certainly difficult (even when there are only 
2 features); we argue that having formal requirements models makes it 
manageable, but that we need to develop a methodology for composing features. 



4.1 Modelling Services 

A service is an extension of POTS (the basic service) which provides the customer
with functionality for interacting with the switch and the billing system. The 
modelling of services is based on the view of services as processes altering a set 
of calls. The current state of a service is characterized by an invariant over calls. 
A call is a structure that manages and describes the current parameters, such as
the caller, the callee, the call state, the paying party . . . However, a call may be 
extended into another call by operations over calls such as fusion, completion, 
etc. Calls are thus central concepts in our modelling, and this makes 
the modelling more flexible. More generally, a call is a structure recording the 
current participants, the connection, the state, and the billing. We use the TLA+ 
syntax for writing service specifications, as follows: 

Fig. 3. Integrating Two Features 



  COEF == 0..100    \* used to define the percentage for contributing in the billing
  CALLS ==
    [party    : SUBSET USERS,
     linkcall : SUBSET (USERS \X USERS),
     paycall  : SUBSET (USERS \X USERS \X USERS \X COEF \X TIME \X TIME),
     com      : SUBSET (USERS \X USERS),
     state    : CALLSTATES]

Variables such as calls, phones, tones, messages, billings, services are typed
according to the following typing invariant. We define it below, together with the
operations or actions which have to preserve it. 

  Typing_Variables_Invariant ==
    /\ calls \in CALLS
    /\ phones \in [USERS -> PHONESTATES]
    /\ tones \in [USERS -> TONESTATES]
    /\ messages \in [USERS -> SUBSET STRING]
    /\ billings \in USERS \X COEF \X USERS \X COEF \X TIME \X TIME
    /\ services \in [USERS -> SUBSET SERVICES]



Now, we can incrementally add new operations that are activated either by 
users or customers, or by the telecom systems. The basic service, called POTS, 
provides the following operations: 

Fig. 4. View of services through calls 



— off-hook 

A user can take the phone off the hook because he/she wants to call somebody 
or because somebody else is calling him/her. The switch will reply either by 
sending a dialtone or by starting the communication. 

  OFFHOOKCALLING(Xcaller) ==
    /\ phones' = [phones EXCEPT ![Xcaller] = "offhook"]
    /\ tones' = [tones EXCEPT ![Xcaller] = "notone"]
    /\ UNCHANGED <<tones, calls, messages, billings, t>>

  \* Y wants to call somebody, namely X, in the call Xcall
  OFFHOOKRINGING(X, Y, Xcall) ==
    /\ Xcall \in calls
    /\ {X, Y} \subseteq Xcall.party
    /\ tones[X] = "ringing"
    /\ tones[Y] = "ringbacktone"
    /\ tones' = [[tones EXCEPT ![X] = "notone"] EXCEPT ![Y] = "notone"]
    /\ phones[Xcalled] # "offhook" => phones' = [phones EXCEPT ![Xcaller] = "offhook"]
    /\ phones[Xcalled] = "offhook" => phones' = phones
    /\ UNCHANGED <<calls, messages, billings, t, services>>
  \* Xcalled is called by somebody else and Xcalled is ringing;
  \* the operation is done by the user

— on-hook 

— dial 

— communication 

We have added an event which is executed infinitely often to model the time, 
since we need to specify the starting point of a call and the ending point of a 
call, for instance. 

  TICTAC ==
    /\ t' = t + 1
    /\ UNCHANGED <<tones, calls, messages, billings, phones, services>>

The global system, called POTS, is operationally defined as a disjunction of 
relations over primed and unprimed variables (thanks to TLA). We define the
set of possible events of the basic system, called POTS: 

  EventsBasicSystem ==
         OffHookCallingEvents
    \cup OffHookRingingEvents
    \cup OnHookFirstEvents
    \cup UpdateCallsEvents
    \cup FinalUpdateCallEvents
    \cup OnHookLastEvents
    \cup DialEvents
    \cup SendingToneDialevents
    \cup DialToneEvents
    \cup CommunicationOkEvents
    \cup CommunicationDownEvents
    \cup CommunicationBusyEvents
    \cup OnHookDownEvents
    \cup OnHookBusyEvents
    \cup Clean_Down_CallsEvents
    \cup Clean_Busy_CallsEvents
    \cup Clean_Completed_Calls
    \cup {TICTAC}

Now we apply the ’Next’ operator to obtain the next relation 
for the operational semantics. 

  NextBasicSystem == Next(EventsBasicSystem)

TLA+ requires that we specify the variables of the system. 

  VarsBasicSystem == <<messages, calls, phones, tones, billings, t, services>>



Finally, we assume that every event is executed under the weak fairness 
assumption. 

  FairnessBasicSystem == WF(VarsBasicSystem, EventsBasicSystem)

We have defined an operator that builds a formula from a set of formulae; it gives 
us a simpler way to specify, since we only have to give the set of possible events 
and apply the operator to it. Now, the bare basic service is simply 
specified by the following formulae. 



  InitBasicSystem ==
    /\ calls = {}
    /\ \A p \in USERS : phones[p] = "onhook"
    /\ \A p \in USERS : tones[p] = "notone"
    /\ billings = {}
    /\ \A p \in USERS : messages[p] = ""
    /\ \A p \in USERS : services[p] = {"basic"}
    /\ t = 0

  SpecificationBasicSystem ==
    /\ InitBasicSystem
    /\ [][NextBasicSystem]_VarsBasicSystem
    /\ FairnessBasicSystem



The basic system provides the user with the basic functionality required for
calling somebody else. At this stage, a user X can call only one user Y; if we 
increase the calling possibilities, we add functionality related to a new service. 
Increasing the basic functionality means that we allow the user additional
operations; if N is the relation characterizing the current service, then a new
functionality is obtained by adding another relation, namely F, as follows: N ∨ F. 
Composing is thus reduced to logical operations over relations on states, although
we may first have to transform the relations. The user view of the service is like a 
reactive system. The modules for POTS have a very restricted scope, since the 
functionality of each is very limited. 
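
The composition N ∨ F can be read operationally: a step of the extended service is either a step of the current service or a step of the added functionality. The Python sketch below represents a relation as a function from a state to the set of its possible successors, so that disjunction becomes set union. The state names and the two toy relations (`basic` for N, `hold` for F) are hypothetical and far simpler than the POTS specification.

```python
def disjunction(*relations):
    """N ∨ F ∨ ...: a step is allowed if any one of the relations allows it."""
    def combined(state):
        result = set()
        for relation in relations:
            result |= relation(state)
        return result
    return combined

def reachable(initial, relation):
    """All states reachable from `initial` by repeatedly taking steps."""
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for nxt in relation(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# N: toy basic service on a single phone.
def basic(state):
    steps = {"onhook": {"dialling"},
             "dialling": {"talking", "onhook"},
             "talking": {"onhook"}}
    return steps.get(state, set())

# F: hypothetical added functionality, putting a call on hold.
def hold(state):
    steps = {"talking": {"held"}, "held": {"talking"}}
    return steps.get(state, set())

extended = disjunction(basic, hold)
print(sorted(reachable("onhook", basic)))
print(sorted(reachable("onhook", extended)))
```

Adding F can only add behaviours here; restricting behaviours, as fairness or invariants do, requires conjunction rather than disjunction.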



4.2 Adding a New Service 

The user’s view deals with operations such as subscribing, unsubscribing, paying, 
billing, and a service is generally characterized by at least two operations that 
enable or disable the service, when the user has subscribed; for instance the 
service called CCBS allows the subscriber to be informed when another user, 
whom he or she is calling and who is busy, becomes idle. 

  CCBS_activation(X) ==
    /\ X \in USERS
    /\ X \notin CCBS_sub
    /\ CCBS_sub' = CCBS_sub \cup {X}
    /\ CCBS_heap' = [x \in (DOMAIN CCBS_heap) \cup {X} |->
                       IF x = X THEN {} ELSE CCBS_heap[x]]
    /\ UNCHANGED <<list_of_unchanged_variables>>

  CCBS_inhibition(X) ==
    /\ X \in USERS
    /\ X \in CCBS_sub
    /\ CCBS_sub' = CCBS_sub \ {X}
    /\ CCBS_heap' = [x \in (DOMAIN CCBS_heap) \ {X} |-> CCBS_heap[x]]
    /\ UNCHANGED <<list_of_unchanged_variables>>
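
Read operationally, the two actions above first test a guard on the set of subscribers and then update the subscriber set and the heap function together. A rough Python rendering is given below, under two assumptions that are not part of the specification: the heap is represented as a dictionary from users to sets, and a false guard is signalled by returning None.

```python
def ccbs_activation(x, users, sub, heap):
    """Enabled only if x is a user who has not yet subscribed."""
    if x not in users or x in sub:
        return None                          # guard false: action not enabled
    new_sub = sub | {x}
    # The domain of the heap is extended with x, which starts with an empty heap.
    new_heap = {u: (set() if u == x else heap[u]) for u in set(heap) | {x}}
    return new_sub, new_heap

def ccbs_inhibition(x, users, sub, heap):
    """Enabled only if x is a subscribed user."""
    if x not in users or x not in sub:
        return None                          # guard false: action not enabled
    new_sub = sub - {x}
    # x is removed from the domain of the heap; other entries are unchanged.
    new_heap = {u: heap[u] for u in set(heap) - {x}}
    return new_sub, new_heap

users = {"a", "b"}
activated = ccbs_activation("a", users, set(), {})
print(activated)
print(ccbs_activation("a", users, *activated))    # already subscribed
print(ccbs_inhibition("a", users, *activated))
```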

We modify the basic service, by strengthening operations of the callee; moreover, 
CCBS is a very interesting service, since it requires the expression of a fairness 
constraint. A first step is to analyse what is shared by CCBS and POTS and 
what is private or local for CCBS. We introduce two variables that will manage 
the current subscribers of CCBS and the waiting users for re-calling somebody. 

VARIABLES

  CCBS_sub,    set of users that have subscribed to CCBS
  CCBS_heap    function defining heaps

The typing invariant of CCBS declares the role of those variables.

INVARIANT_CCBS = /\ CCBS_sub \subseteq USERS
                 /\ CCBS_heap \in [USERS -> SUBSET USERS]
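
The activation and inhibition events and the typing invariant can be mimicked in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the TLA+ model itself: the dictionary-based state, the function names and the sample user names are all hypothetical.

```python
# Illustrative sketch only: USERS, the state layout and all names are assumptions.
USERS = frozenset({"alice", "bob", "carol"})

def init_state():
    # Initially nobody has subscribed and no heap exists.
    return {"sub": frozenset(), "heap": {}}

def invariant(state):
    # CCBS_sub is a subset of USERS; every heap maps a user to a set of users.
    return (state["sub"] <= USERS
            and all(k in USERS and v <= USERS for k, v in state["heap"].items()))

def activate(state, x):
    # Guard: x is a user who has not subscribed yet.
    if x not in USERS or x in state["sub"]:
        return None  # event not enabled
    heap = dict(state["heap"])
    heap[x] = frozenset()  # the new subscriber starts with an empty heap
    return {"sub": state["sub"] | {x}, "heap": heap}

def inhibit(state, x):
    # Guard: x is a user who is currently subscribed.
    if x not in USERS or x not in state["sub"]:
        return None  # event not enabled
    heap = {k: v for k, v in state["heap"].items() if k != x}
    return {"sub": state["sub"] - {x}, "heap": heap}
```

Returning None for a disabled event keeps the guard and the update in one function; both events preserve the typing invariant by construction.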

The next step is to define "side-effects" on events of the basic service. CCBS
requires an event for dequeuing recalls for users having subscribed to CCBS; we
call it CCBS_Dequeue(X, Y, Xcall), and it requires a fairness assumption.

CCBS_Dequeue(X, Y, Xcall) =
  /\ tones[X] = "notone"
  /\ phones[X] = "onhook"
  /\ phones[Y] = "onhook"
  /\ tones[Y] = "notone"
  /\ X \in CCBS_sub
  /\ tones' = [[tones !EXCEPT[X] = "ringing"] !EXCEPT[Y] = "ringing"]
  /\ Xcall.state = "busyCCBS"
  /\ Xcall \in calls \cap CCBS_heap[Y]
  /\ {X, Y} \subseteq Xcall.party
  /\ LET newcall = CHOOSE c . /\ c \in calls \cap CCBS_heap[Y]
                              /\ c.state = "waiting"
                              /\ c.com = Xcall.com
                              /\ c.paycall = Xcall.paycall
                              /\ c.linkcall = Xcall.linkcall
     IN /\ calls' = (calls \ {Xcall}) \cup {newcall}
        /\ CCBS_heap' = [CCBS_heap !EXCEPT[Y] = @ \ {Xcall}]
  /\ UNCHANGED <<messages, phones, billings, calls, t, CCBS_sub>>

Now, we modify two events in the specification of the basic service, namely
COMMUNICATION_BUSY, which handles calls to a busy user, and
OFFHOOKRINGING, which handles the case where somebody is called and their phone
is ringing. Hence, we modify events of the basic service and add new events.

CCBS_COMMUNICATION_BUSY(X, Y, Xcall) =
  /\ Xcall \in calls
  /\ Xcall.state = "waiting"
  /\ Xcall.party = {X, Y}
  /\ X /= Y
  /\ \E c \in calls :
       /\ c /= Xcall
       /\ phones[Y] = "offhook"
       /\ tones[Y] = "talking"
       /\ Y \in c.party
       /\ c.state = "active"
       /\ X \notin c.party
  /\ phones[X] = "offhook"
  /\ tones[X] = "dialling"
  /\ LET newcall =
       CHOOSE c . /\ c \in CALLS \ calls
                  /\ c.state = "busyCCBS"
                  /\ c.party = {X, Y}
                  /\ c.com = {}
                  /\ c.paycall = {}
                  /\ c.linkcall = {<<X, Y>>}
     IN /\ calls' = (calls \ {Xcall}) \cup {newcall}
        /\ CCBS_heap' = [CCBS_heap !EXCEPT[Y] = @ \cup {newcall}]
        /\ tones' = [tones !EXCEPT[X] = "CCBStone"]
  /\ UNCHANGED <<phones, messages, billings, t, CCBS_sub>>

CCBS_OFFHOOKRINGING(X, Y, Xcall) =
  /\ Xcall \in calls
  /\ {X, Y} \subseteq Xcall.party
  /\ \/ tones[X] = "ringing"
     \/ tones[Y] = "ringing"
  /\ <<X, Y>> \in Xcall.linkcall
  /\ X \in CCBS_sub
  /\ phones[X] /= "offhook" => /\ phones' = [phones !EXCEPT[X] = "offhook"]
                               /\ tones' = [tones !EXCEPT[X] = "notone"]
  /\ phones[X] = "offhook" => /\ phones' = [phones !EXCEPT[Y] = "offhook"]
                              /\ tones' = [tones !EXCEPT[Y] = "notone"]
  /\ UNCHANGED <<calls, messages, billings, t, CCBS_sub, CCBS_heap>>

Now, events of CCBS are defined as follows: 

CCBS_events =
  \cup UNION1(CCBS_activation, USERS)
  \cup UNION1(CCBS_inhibition, USERS)
  \cup UNION2(CCBS_Dequeue, USERS, USERS, CALLS)
  \cup UNION2(CCBS_OFFHOOKRINGING, USERS, USERS, CALLS)

However, POTS is modified by the service CCBS, by restricting COMMUNI- 
CATION events when the called user is busy; in fact, it leads to an enqueueing 
of the busy called user. We define a restriction of the POTS service which is 
modified and then we define a way to instantiate a system, defined by a set of 
events. 

CCBS_Restriction(System) =
  (System \ CommunicationBusyEvents)
  \cup UNION3(CCBS_COMMUNICATION_BUSY, USERS, USERS, CALLS)

CCBS_Instance(System) = CCBS_Restriction(System) \cup CCBS_events

The properties of CCBS state that when somebody (X) calls somebody else (Y)
and Y is busy, then, once Y goes back on hook, the system recalls X and Y: X
and Y will ring together, provided the fairness constraints are ensured.

CCBSPlusBSEvents = CCBS_Instance(EventsBasicSystem)

SpecCCBSPlusBS =
  /\ InitBasicSystem
  /\ InitCCBS
  /\ [][Next(CCBSPlusBSEvents)]_<<VarsBasicSystem, VarsCCBS>>
  /\ WF(VarsCCBS \cup VarsBasicSystem, CCBSPlusBSEvents)

THEOREM SpecCCBSPlusBS => [] INVARIANT_CCBS

If ’X’ calls ’Y’ while ’Y’ is busy and ’X’ has subscribed to CCBS,
then eventually ’Y’ is appended to the waiting heap for ’X’:

THEOREM
  SpecCCBSPlusBS =>
    [](Calling(X,Y) /\ (X \in CCBS_sub /\ Busy(Y)) => <>(X \in CCBS_heap[Y]))

If ’X’ is in the waiting heap of ’Y’, and if ’Y’ has subscribed to CCBS,
while ’X’ is infinitely often not busy, then eventually ’X’ and ’Y’ will
both ring:

THEOREM
  SpecCCBSPlusBS =>
    (X \in CCBS_heap[Y] /\ [](Y \in CCBS_sub) /\ []<>~Busy(X))
      => <>(RingingBoth(X,Y))



We have expressed the formal modelling of the basic service and of CCBS; now, 
we have to verify theorems and to validate the specifications. 



4.3 Coordinating Views 

Our model of services in TLA+ can be verified and validated using the Atelier
B toolkit: we verify invariants using a coding of our TLA+ specifications in B.
This is possible because our TLA+ specifications are made up of imperative
actions, written x' = f(x), where f(x) is an expression codable in B. Services
can thus be viewed either as B abstract machines or as TLA+ modules. The
coordination of views means that the properties observed in each model are not
contradictory. Viewing services as B abstract machines leads us to forget
fairness issues; however, it gives us a framework for animating and verifying
the B view of a TLA+ specification. Our approach is based on the use of a
theorem prover, but one can also use a model-checking-based tool.



4.4 Validation and Verification 

We give a graphical representation of our formal models. The graphical syntax 
is informally explained and, where appropriate, we comment on how the formal 
meaning is captured using LOTOS, B and TLA. The semantics are clearly based 
on a state transition model and, as such, are easily communicated to the client 
through a process of animation. 

We have specified a simple (POTS) client-oriented model of phone behaviour.
It is sufficiently complex to illustrate the graphical syntax (figure 5) that is
employed to communicate the formal semantics to the client.

The following aspects of the specification should be noted: 

The header 

The name of the class (Phone) being specified is given first in the header of the 
diagram. The other classes which are used in the specification of the new class 
are listed after the USING keyword: the Phone uses classes signal and on-off.

[Figure 5: state-transition diagram of the Phone class, with header
"Phone USING ID, signal, on-off".
Fairness: WF(noconnection).
State invariant: (regard = on) => ((listen = ringing) or (listen = silent)).]

Fig. 5. The Phone



The interface 

The interface to the class is represented by the connections at its boundary. Each 
connection corresponds to a service. In this case there are 5 services, namely: 
lift, drop, dial, listen and regard. Lift, drop and dial correspond to 
transformer services. When requested they result in a state transition. Listen 
and regard correspond to accessor services. When requested they return a value 
to the service requester. The type of the value returned is identified by a class 
name: listen, for example, returns a signal value. Services can be parameterised
by a set of input classes: dial, for example, is parameterised by an ID
value. Services can be polymorphic on their input classes. In other words, a class
can have two different services of the same name provided they can be distin- 
guished by the types of their input parameters. The user of the class sees the 
class as a black box. The internal state of the class is encapsulated by its inter- 
face. The only access to state information is through the accessors. (There is one 
more type of service which is not illustrated by the Phone: the dual service is a 
combination of a transformer and an accessor: it returns a value and results in 
a state transition.) 

Communication 

Communication between an object server and its environment of clients is taken, 
unless otherwise specified, to be synchronous. This may lead to situations in 
which services are requested but are not enabled by the server. We note that 
accessor services are always enabled. Duals and transformers may not always be 
enabled: if a client requests a service which is not enabled then it is the client’s 
responsibility to avoid a potential deadlock situation. One role of fairness in our 
models is to guarantee that services will eventually be enabled.

The operational semantics

There are eight states in the Phone class. Thus every Phone instance (object) 
must be in one of these eight states. These states are represented as nodes in 
the inside of the class boundary. For each state, each of the accessor values must 
be defined. To aid compositional specification techniques, and to facilitate the 
specification of classes with large (potentially infinite) numbers of states, we can 
define a class to be structured as a set of component classes. Then, these internal 
state values can be used to define the external accessor values. This provides 
a degree of implementation freedom and emphasises that internal details are
hidden from the outside. In the Phone example, there is no structure definition as
the number of states is manageable without one. The initial state of an object 
on creation is specified by a bold pointer which does not originate from another 
state. Hence, a Phone always starts on and silent. The state transitions which 
occur in response to an external service request are represented by solid pointers 
from old to new states. 

Invariants 

State invariant properties define restrictions on the possible sets of component 
values. For example, as it is shown in figure 5, we may require that when onhook 
the Phone must be ringing or silent. These properties are verified, for more 
complex cases, using B: by checking that all transitions are closed with respect 
to the invariant it is not necessary to examine every single reachable state (which 
we can do directly with the simple Phone model). Note that the state invariants 
specified in this way are explicit requirements of the client that must be respected 
by the model. A specification where the invariants are not true is said to be 
contradictory. 
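
The closure argument can be illustrated with a small Python sketch. It checks that every transition leaving a state that satisfies the invariant also arrives in such a state, which is the inductive step that avoids enumerating reachable states. The state space and the transition table below are hypothetical stand-ins, not the eight-state Phone of figure 5; only the invariant is taken from the figure.

```python
# Hypothetical miniature model: a state is a pair (regard, listen).
INITIAL = ("on", "silent")  # a Phone starts on and silent

def invariant(state):
    # (regard = on) => (listen = ringing or listen = silent)
    regard, listen = state
    return regard != "on" or listen in ("ringing", "silent")

# Assumed transition table: state -> list of successor states.
TRANSITIONS = {
    ("on", "silent"): [("off", "dialtone"), ("on", "ringing")],
    ("on", "ringing"): [("off", "silent"), ("on", "silent")],
    ("off", "dialtone"): [("on", "silent")],
    ("off", "silent"): [("on", "silent")],
}

def violations(transitions, inv):
    """Transitions that start inside the invariant but leave it."""
    return [(s, t)
            for s, successors in transitions.items() if inv(s)
            for t in successors if not inv(t)]

def is_inductive(initial, transitions, inv):
    # The invariant holds initially and is closed under every transition.
    return inv(initial) and not violations(transitions, inv)
```

If a transition breaks the invariant, violations returns the offending pair of states, which is the kind of counterexample one would show to the client during animation.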

Nondeterminism 

Nondeterminism is formalised as internal state transitions that may occur
independently of external service requests. These are represented by (possibly
labelled) dotted pointers from old to new states. For example, when off and 
connecting the Phone user has no control over whether the number they are 
trying is busy, free or if noconnection is possible. These three cases are spec- 
ified using internal actions (labelled appropriately). The difference between in- 
ternal and external actions specifies a point-of-view onto a class (and the objects 
of the class). In this paper, our models specify the Phone user’s point of view (or 
abstraction). The way in which the telephone network interacts with the Phone 
is abstracted away from in the form of nondeterministic transitions. Certainly, it 
is necessary to specify other points of view when modelling the whole telephone 
network. Our modular approach lets us work with different abstractions and 
then helps us to integrate these abstractions into a complete specification. This 
is beyond the scope of this paper, which concentrates on user requirements. 

Fairness 

Liveness conditions can be specified on the nondeterministic events in the model. 
For example, we may require that when off and connecting the user does not 
wait forever for a state transition if they refuse to drop the phone. This must
be specified in a separate TLA (temporal) clause. In figure 5, we specify weak
fairness on the noconnection action. 

(In)finite processes 

A Phone is an infinite process. In later examples we specify finite behaviours
which EXIT after some specific behaviour is fulfilled. A Phone is said to be of
type NOEXIT.

A new feature: black list 

The Black List feature has a similar function to originating call screening, but 
restricts incoming rather than outgoing calls. The idea is that you can store 
a list of numbers that you know you do not wish to talk with and then your 
phone does not ring when such numbers are the source of an incoming call. Our 
specification of this feature is illustrated in figure 6. 



[Figure 6: diagram of the composed system Phone+BlackList = Phone |[dialIn]| BlackList]

Fig. 6. Black list

Again, we have some comments to make with regard to this feature model: 

Composition Re-Use 

The composition is precisely that seen for a similar, better known, CallerID
feature: there is internal synchronisation on the dialIn event and the system
depends on an action refinement in the network to carry the new identification
data using a dialInFrom action.

Phone refinement 

Unlike CallerID, not all dialInFrom actions result in a dialIn action: the blacklist
filters out all incoming dials which are stored in the list of numbers in its state.
However, like CallerID, from the point of view of the user the new system is a
refinement of the old phone: the only difference is the resolution of some of
the nondeterminism in the original phone model.
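
The filtering behaviour can be sketched in Python as follows. This is a minimal illustration, assuming a set-based list of numbers; the class and method names echo the addID, removeID, dialInFrom and dialIn services but are otherwise hypothetical, and the sketch ignores synchronisation and fairness.

```python
class BlackList:
    """Hypothetical sketch of the BlackList component's filtering role."""

    def __init__(self):
        self.blocked = set()  # numbers the user does not wish to talk with

    def add_id(self, number):
        self.blocked.add(number)

    def remove_id(self, number):
        self.blocked.discard(number)

    def dial_in_from(self, number):
        # True means the call is passed on to the phone as a dialIn event;
        # False means the black list has filtered it out.
        return number not in self.blocked


class Phone:
    """Assumed minimal phone: it only records whether it is ringing."""

    def __init__(self):
        self.ringing = False

    def dial_in(self):
        self.ringing = True


def incoming_call(phone, blacklist, number):
    # Composition on dialIn: the phone only sees calls the list lets through.
    if blacklist.dial_in_from(number):
        phone.dial_in()
```

From the phone's side nothing changes: it still reacts to dialIn only, which is why the composed system refines the original phone.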

Weak fairness guarantees eventuality 

We require weak fairness on the dialIn event in the BlackList component.
In the BlackList component we see that after a dialInFrom event, the external
services removeID and addID may not be enabled until a dialIn action is
performed, in the case where the number is not black listed. However, weak
fairness on dialIn guarantees that this transition will eventually occur. Thus, we
guarantee that the telephone user will not be deadlocked if they wish to add or
remove a number from the blacklist because of an incoming call.

Localisation 

At first glance, this feature seems to be local. All other users of the telephone
system can remain unaware of this particular feature at any given phone. How- 
ever, we have abstracted away from an implementation detail which has global 
effect: what signal should a caller hear if they telephone someone who has black 
listed them? There are a number of choices: 

— A new type of signal telling them that they have been blacklisted. 

This may not be acceptable from a social point of view: do you really want
someone to know that you don’t wish to talk to them?

— A noconnected signal. 

This may not be acceptable since the caller may misinterpret the signal as 
saying that the number they are dialling is impossible to connect. Further- 
more, an intelligent user may realise why they are unconnected, which brings 
us back to the first problem. 

— A busy signal. 

This may be unacceptable since the caller may continue dialling because 
they think the person they are trying to contact will be available as soon as 
their current call is completed. 

— A ringing signal. 

This seems to be the most acceptable choice, and in our network model we 
specified the feature in this way. Thus the blacklist service required only a local
change to the telephone user who requested this service. All other users
retain their original behaviour. 

It is only through animation that a user can be expected to understand such 
choices and help the designers to resolve the nondeterminism. 

5 Conclusion 

The problem of telephone feature interaction is just a particular instance of a 
general problem in software engineering. The same problem occurs when we con- 
sider inheritance in object oriented systems, sharing data in distributed systems, 
multi-way synchronisation in systems of concurrent processes, etc. However,
the problem is particularly difficult in telephone systems because features are
the increments of development.

We have shown the importance of re-usable composition mechanisms. Although 
our work is targeted towards the client during requirements capture, we believe 
that the same models could be used during design and at the network level. 
We support the principle of developing re-usable analysis techniques based on 
re-usable synthesis mechanisms. The object oriented approach can be extended 
to include a classification of feature types and we hope to map this onto a formal 
algebra for feature development. 

We have used a graphical notation for communicating with the customer. How- 
ever, our graphics are based on formal notations of languages, which may be 
difficult for the customer to understand. This work was very helpful in studying
the complementary nature of different formalisms. Logical formalisms such
as B or TLA are really suitable for logical analysis of services based on proof 
techniques. Animation is made easier by automata-based representations. 

This work is dependent on the different view points and the different semantic 
models. The integration of these semantics and the development of user-oriented 
tools is the most important element of our current and future work. Finally, the
integration of refinement-based reasoning is an important point to develop, with
experiments in other domains.



Abstract. The verification system PVS is used to obtain mechanized
support for the formal specification and verification of concurrency con- 
trol protocols, concentrating on database applications. A method to ver- 
ify conflict serializability has been formulated in PVS and proved to be 
sound and complete with the interactive proof checker of this tool. The 
method has been used to verify a few basic protocols. Next we present a 
systematic way to extend these protocols with new actions and control 
information. We show that if such an extension satisfies a few simple 
correctness conditions, the new protocol is serializable by construction. 



1 Introduction 

Concurrency control protocols [SKS97, Ull88], when applied to databases, manage
the concurrent access to a database by multiple users or processes. Access is
performed by means of transactions, consisting of a number of actions (such as 
reads and writes of data items). This access has to be both correct, i.e. always 
leaving the database in a consistent state, and efficient, i.e. providing a good 
overall performance. One of the most important correctness notions is serializ- 
ability of transactions, which is the main topic of this paper. 

It is notoriously difficult to achieve both correctness and efficiency at the
same time. The most popular database protocol, the Two Phase Locking protocol
(2PL), is simple and ensures serializability. Although it became a commercial
standard in the seventies, it has been criticized for low performance and the
possibility of deadlock (see, for instance, [Tho93]). A number of more efficient
database protocols have been suggested, often on top of basic protocols such as
2PL. Newly developed protocols are becoming increasingly complex, and their
correctness becomes difficult to ensure. Specification and reasoning are often
very informal, which easily leads to ambiguous specifications. All these factors
make these new protocols difficult to understand and to use, and they
increase the danger of incorrect protocols.

To address this problem, observe that many database protocols can be mod- 
eled as variations of a few basic concurrency control protocols. Although these 
variations can be obtained in different ways, they can often be considered as
extensions of a basic protocol. An extension of a database protocol is a protocol
that includes more control information (such as timestamps and versions) and
corresponding new actions.

The aim is to obtain the correctness of extensions from the correctness of a 
basic protocol. Here we focus on only one important correctness notion, namely 
serializability. An interleaved execution of a number of transactions is said to be 
serializable, if it has the same effect on a database as some serial execution of 
these transactions, i.e. an execution which has no interleaving between actions 
of different transactions. Deadlocks are assumed not to occur (as in, e.g., [Ull88]),
since they do not influence serializability.

Given some set of basic concurrency control protocols, we propose to prove 
the correctness of extensions of these protocols, using the following strategy: 
a) Prove correctness (i.e. serializability) of the basic protocols. b) Derive the
correctness of the extensions in a systematic way, using some assumptions on
their construction. Ideally, this should be done in a structured way, using some 
mechanical support. The aim of our paper is to suggest a method to implement 
this strategy. Therefore, we address the following questions: 1) How to obtain 
mechanical support for specification and verification? 2) How to model concur- 
rency control protocols? 3) How to formalize serializability? 4) How to verify 
serializability? 5) How to formalize protocol extensions and which conditions 
are needed to ensure their correctness? 

1) Mechanical support. To get mechanical support, we use a higher-order 
interactive theorem prover, since notions like serializability are easily expressed 
in a property-oriented, assertional, way. To express general properties about 
these notions, that hold for all protocols, a higher-order logic is needed. Since 
we would like to use arbitrary data types and not restrict ourselves to finite 
state systems, completely automatic verification is not feasible. Although there 
are several verification systems that satisfy our requirements, we have chosen 
to use PVS [PVS], because it has a convenient specification language and is 
relatively easy to learn and to use. 

The specification language of PVS is a strongly-typed higher-order logic. 
Specifications can be structured into a hierarchy of parameterized theories. 
There are a number of built-in theories and a mechanism for constructing abstract
datatypes. The PVS system contains an interactive proof checker with,
for instance, induction rules, automatic rewriting, and decision procedures for 
arithmetic. It allows users to construct proofs interactively, to discharge simple 
verification conditions automatically, and to check proofs mechanically. 

2) Specification of protocols. To model a particular protocol in PVS, we
define two types: 1) Actions, such as read and write, and possibly additional
actions necessary for the adjustment of the control information; 2) States,
representing control information (locks, timestamps, etc.); and two predicates: 3)
Effect, defining how a state is changed after applying a particular action; 4)
Pre, defining which actions are allowed in a particular state, and which are not.

3) Serializability notions. A schedule is a sequence of actions by transactions.
Intuitively, a schedule is considered to be correct if it is equivalent to some
serial schedule. Serial schedules are those which have no interleaving between
actions of different transactions. There are different ways to define equivalence of
schedules. The most intuitively appealing one leads to the notion of view
equivalence. Informally, two schedules are view equivalent iff each transaction in these
schedules reads the values written by the same transaction. A schedule is said
to be view serializable if it is view equivalent to some serial schedule. Testing
view serializability is NP-complete [Pap79], and therefore this notion is difficult
to use in practice. Another form of schedule equivalence is conflict equivalence,
leading to conflict serializability. Two schedules are conflict equivalent iff one of
them can be transformed into the other by a sequence of swaps of non-conflicting
actions. Testing conflict serializability has quadratic complexity, and therefore
the majority of existing database protocols ensure not just view serializability,
but the stronger notion of conflict serializability.
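
For illustration, the textbook conflict-graph test for conflict serializability can be sketched in Python. This is the traditional graph-based check, not the method proposed in this paper, and the Action tuple and function names are assumptions made for the example.

```python
from collections import namedtuple

# An action records the action name, the transaction and the data item.
Action = namedtuple("Action", ["act", "tr", "vari"])

def conflicts(a, b):
    """Reads and writes of different transactions on the same data item
    conflict when at least one of them is a write."""
    return (a.tr != b.tr and a.vari == b.vari
            and {a.act, b.act} <= {"R", "W"} and "W" in (a.act, b.act))

def conflict_edges(schedule):
    """Edge (T, U): some action of T precedes a conflicting action of U."""
    edges = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                edges.add((a.tr, b.tr))
    return edges

def is_conflict_serializable(schedule):
    """True iff the conflict graph is acyclic (repeated source removal)."""
    edges = conflict_edges(schedule)
    remaining = {a.tr for a in schedule}
    while remaining:
        sources = {n for n in remaining
                   if not any(v == n and u in remaining for u, v in edges)}
        if not sources:
            return False  # every remaining transaction lies on a cycle path
        remaining -= sources
    return True
```

The nested loop over pairs of actions is what makes the test quadratic in the length of the schedule.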

We formalize the notions of conflict and view serializability, and prove that
any conflict serializable schedule is also view serializable. This relation is well
known but has never been checked mechanically. In fact, there is no standard 
definition of view serializability in the literature. Here we combine the informal 
intuition of [SKS97] with the reads-from relation of [Vid91]. 

4) Method of verification. A traditional method for proving conflict se- 
rializability is based on conflict graphs. Our method is a modification of this 
traditional method and does not use any notions from graph theory. We believe
that our method is logically simpler and more straightforward, and therefore more
appropriate for mechanical verification. This makes it possible to implement
our method efficiently in PVS.

Our method is based on the notion of conflict-preserving timestamps (CPT).
We formulate a condition for schedules to be conflict serializable using an assign- 
ment of timestamps to transactions which orders conflicting transactions (two 
transactions are conflicting iff at least one of them contains a write to a com- 
mon data item). We prove that this condition is necessary and sufficient. Hence, 
to show that a protocol ensures conflict serializability, we must prove that any 
schedule accepted by this protocol satisfies the condition. This implies then that 
the protocol indeed ensures conflict serializability, as well as the weaker notion 
of view serializability. 
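
One reading of this condition can be sketched in Python: a timestamp assignment is acceptable when, for every pair of conflicting actions, the transaction of the earlier action has the smaller timestamp. The sketch is an assumption based on the informal description in this introduction, not the PVS definition; the function names and the brute-force search are purely illustrative.

```python
from itertools import permutations

def conflicts(a, b):
    # Actions are (act, tr, vari) triples; only reads and writes can conflict.
    (act_a, tr_a, var_a), (act_b, tr_b, var_b) = a, b
    return (tr_a != tr_b and var_a == var_b
            and {act_a, act_b} <= {"R", "W"} and "W" in (act_a, act_b))

def preserves_conflicts(schedule, ts):
    """ts maps each transaction to a timestamp; conflicting actions must be
    ordered in the schedule the same way as their transactions' timestamps."""
    return all(ts[a[1]] < ts[b[1]]
               for i, a in enumerate(schedule)
               for b in schedule[i + 1:]
               if conflicts(a, b))

def find_timestamps(schedule):
    """Brute-force search for a suitable assignment (exponential, demo only)."""
    transactions = sorted({a[1] for a in schedule})
    for order in permutations(transactions):
        ts = {tr: rank for rank, tr in enumerate(order)}
        if preserves_conflicts(schedule, ts):
            return ts
    return None
```

In a protocol proof one does not search for the timestamps; they are read off the protocol's state, for example from the moment at which a transaction reaches a particular phase.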

5) Extensions and correctness conditions. Suppose some basic protocol, 
for instance the 2PL protocol, has been proved correct. Adding more control 
information and more actions, we obtain various extensions of this protocol. We 
show that serializability of these extensions is ensured by four simple correctness 
conditions. The proof that these conditions lead to serializable protocols is far 
from trivial, but has to be done only once. By applying the resulting extension 
scheme, we easily obtain protocols that are serializable by construction. 

As an example, we consider the 2PL protocol and several layered extensions. 
In the basic protocol, serializability is ensured by locking and unlocking data 
items. The first extension adds sequences of transactions, waiting for data items
to become available (i.e., unlocked). The second extension gives priority to urgent
transactions, resulting in a more realistic protocol. Since we formally verified the
correctness of the 2PL protocol, a simple check of the four conditions leads to
the correctness of these new protocols.

Structure of this paper. This paper is organized as follows. In section 2, 
we provide a general specification pattern and apply it to the specification of 
the 2PL protocol. In section 3, the notions of conflict and view serializability are 
formalized. We prove that conflict serializability implies view serializability. In 
section 4, our verification method is presented and its soundness and complete- 
ness are shown. The method has been applied to verify the 2PL protocol and the 
Timestamp Ordering protocol. In section 5, we formalize extensions of protocols 
and the restrictions on these extensions, needed to ensure their correctness. In 
section 6, we apply our method to specify and verify several layered extensions 
of the 2PL protocol. Section 7 contains some concluding remarks. 

2 Specification of Protocols 

We consider protocols in which transactions perform atomic actions on certain 
data items. Two basic actions are common for such database protocols: read 
and write, which are the only actions that concern the values of the data items. 
Additionally, there are usually other actions, necessary for the concurrency con- 
trol. The set of actions of a database protocol is defined by type ActionNames, 
containing at least read and write actions (denoted as R and W). In the PVS 
notation (henceforth written in typewriter style): 

R, W : ActionNames 

The set of data items is defined by uninterpreted type Variables; the set of 
transactions is defined by type Transactions, representing the names of trans- 
actions. Moreover, we define a type Actions consisting of records with three 
fields, called act, tr, and vari, expressing that a particular action is performed 
by a transaction on a data item. 

Actions : TYPE = [# act : ActionNames, tr : Transactions, 
vari : Variables #] 

E.g., (W, T, x) represents a write action by transaction T on variable x. 

Concurrency control protocols maintain a control part to determine which 
actions on data items are allowed and which are not allowed in a particular state 
of a database. E.g., the control part for lock-based protocols determines which 
data items are locked and in which mode (shared or exclusive). The control part 
for timestamp-based protocols contains information about timestamps of data 
items. In PVS, the control part for database protocols is defined by type States. 

Each action causes certain changes in the control part. For example, for
lock-based protocols, it may lock or unlock some data items. For timestamp-based
protocols, this concerns the adjustment of read- and write-timestamps of some
data items. Therefore, we define the initial value of the control part, i.e. the
initial state, and how the control part is changed after every possible action. We
also have to define which actions are allowed in a particular state, and which
are not. E.g., a transaction cannot lock a data item in an exclusive mode if it
is already locked by another transaction. Consequently, a database protocol is
defined by the following steps:

1. Define type ActionNames, containing the atomic actions R and W and pos- 
sibly some other atomic actions, responsible for the adjustment of control 
information. 

2. Define type States, containing all control information essential for the def- 
inition of the protocol, and define the initial state is. 

3. Define how a particular state is changed after applying a particular (allowed)
action (e.g., a read or write of a data item) by means of the Effect predicate;
a function with three arguments of types States, Actions, States, resp.,
and result of type bool. For states s1 and s2 and an action a1, we have
Effect(s1, a1, s2) = TRUE iff s2 is obtained from s1 by applying a1.

4. Define which actions are allowed in a particular state by the Pre predicate.
For a state s1 and an action a1 we have Pre(s1, a1) = TRUE iff a1 is
allowed in s1.

A finite execution is represented by a sequence r of the form
$s_0 \xrightarrow{a_0} s_1 \xrightarrow{a_1} \cdots \xrightarrow{a_n} s_{n+1}$.
Here $s_i$ ($0 \le i \le n+1$) are states, and $a_i$ ($0 \le i \le n$) are actions. Infinite
executions are represented by all finite approximations. Sequence r is a correct
execution or run iff $s_0$ is the initial state, subsequent states are related by the
Effect predicate, and actions are enabled, as expressed by the Pre predicate.
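
The definition of a run can be sketched as a small Python check. The function is_run follows the three requirements just stated; the toy protocol used to exercise it (a state is the set of data items written so far, and a read is allowed only on an item that has been written) is a hypothetical example and not one of the protocols studied in this paper.

```python
def is_run(state_seq, action_seq, initial, effect, pre):
    """Check that the states and actions form a correct execution."""
    if len(state_seq) != len(action_seq) + 1:
        return False                      # one more state than actions
    if state_seq[0] != initial:
        return False                      # must start in the initial state
    return all(pre(s, a) and effect(s, a, s_next)
               for s, a, s_next in zip(state_seq, action_seq, state_seq[1:]))

# Hypothetical toy protocol, for illustration only.
TOY_INITIAL = frozenset()

def toy_pre(state, action):
    act, _tr, vari = action
    return act == "W" or vari in state    # reads need a previous write

def toy_effect(state, action, next_state):
    act, _tr, vari = action
    expected = state | {vari} if act == "W" else state
    return next_state == expected
```

Representing Effect as a predicate over (state, action, next state), rather than as a function, mirrors the text and also allows nondeterministic protocols.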

In PVS, a run r is formalized as a record with two fields: StateSeq(r) is a
finite sequence of states, and ActionSeq(r) is a finite sequence of actions, where
StateSeq(r) has one more element than ActionSeq(r). For the example run
above, we have StateSeq(r) = $s_0 s_1 \ldots s_n s_{n+1}$ and ActionSeq(r) = $a_0 a_1 \ldots a_n$.
A finite sequence of actions is called a schedule. For instance, (W, T1, x) (W,
T2, y) (R, T1, y) represents an execution where first transaction T1 writes a
data item x, then transaction T2 writes a data item y, and next T1 reads y.

For a protocol, represented by States, is, Actions, Effect and Pre, and a
run r of this protocol, we say that ActionSeq(r) is a schedule, allowed by this 
protocol. Given a definition by the four points mentioned above, we identify a 
protocol with the set of allowed schedules. 

protocol : setof[Schedules] = 
  { S : Schedules | EXISTS (r : Runs) : S = ActionSeq(r) } 
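The four-step recipe above can be sketched in executable form. The following Python fragment is only an illustration, not the PVS theory: it assumes that Effect is deterministic, so it is written as a function returning the successor state, and the toy protocol (`toy_pre`, `toy_effect`) is hypothetical.

```python
def run_of(schedule, initial, pre, effect):
    """Return the state sequence of the run induced by `schedule`,
    or None if some action is not enabled in the state it is applied to.
    Assumption: `effect` is a function (state, action) -> next state."""
    states = [initial]
    for action in schedule:
        if not pre(states[-1], action):
            return None
        states.append(effect(states[-1], action))
    return states


def allowed(schedule, initial, pre, effect):
    """A schedule belongs to the protocol iff it is the action sequence of a run."""
    return run_of(schedule, initial, pre, effect) is not None


# Hypothetical toy protocol: the state is the set of items written so far,
# and a read of an item is enabled only after some write of that item.
def toy_pre(state, action):
    name, _, item = action
    return name == "W" or item in state


def toy_effect(state, action):
    name, _, item = action
    return state | {item} if name == "W" else state


example = [("W", "T1", "x"), ("W", "T2", "y"), ("R", "T1", "y")]
print(allowed(example, frozenset(), toy_pre, toy_effect))
```

As in the text, the state sequence of a run has one more element than its action sequence.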



2.1 Example of the Two Phase Locking Protocol 

Informal description. The 2PL protocol (see, e.g., [SKS97]) requires that ac- 
cess to data items is done in a mutually exclusive manner; that is, while one 
transaction is accessing a data item, no other transaction can modify that data 
item. The most common method used to implement this requirement is to allow 
a transaction to access a data item only if it is currently holding a lock on that 
item. There are various modes in which a data item may be locked. The basic 
2PL protocol, considered in this paper, has only two modes: 

— Shared. If a transaction T has obtained a shared-mode lock on item x, then 
T can read, but cannot write, x. 

— Exclusive. If a transaction T has obtained an exclusive-mode lock on item 
x, then T can both read and write x. 

Let A and B represent arbitrary lock modes. Suppose that transaction T2 requests 
a lock of mode B on item x on which transaction T1 (T1 ≠ T2) currently holds 
a lock of mode A. If T2 can be granted a lock on x immediately, in spite of the 
presence of the mode A lock, then we say that mode B is compatible with mode 
A. In the 2PL protocol, shared mode is compatible with shared mode, but not 
with exclusive mode; exclusive mode is compatible with neither shared nor 
exclusive mode. 

To access a data item, transaction T must first lock that item in the cor- 
responding mode. If the data item is already locked in an incompatible mode, 
the request to lock this item is rejected. The 2PL protocol requires that each 
transaction issues lock and unlock requests in two phases: 

— Growing phase. A transaction may obtain locks, but may not release any 
lock. 

— Shrinking phase. A transaction may release locks, but may not obtain any 
new locks. 

Initially, a transaction is in the growing phase. The transaction acquires locks as 
needed. Once the transaction releases a lock, it enters the shrinking phase, and 
it can issue no more lock requests. 

PVS implementation. We specify this protocol, following the four steps 
mentioned above. The Effect2PL predicate and the Pre2PL predicate are not 
shown here. 

— ActionNames. In our model, locking is incorporated in read and write actions, 
and hence does not require a separate action. We only add an unlock action 
to unlock a data item which is locked in a shared or exclusive mode and 
a downgrade action which changes the mode of the lock from exclusive to 
shared. 

ActionNames2PL : TYPE = { R, W, unlock, downgrade } 

— We define States2PL by a record with three fields: xset and sset map each 
transaction to the set of data items which it locks in exclusive and shared 
mode, respectively; shrinking is the set of transactions which have already entered 
the shrinking phase and therefore cannot acquire any new locks. 

States2PL : TYPE = 
  [# xset      : [Transactions -> setof[Variables]], 
     sset      : [Transactions -> setof[Variables]], 
     shrinking : setof[Transactions] #] 



In the initial state, is2PL, all the data items are unlocked and no transaction 
is shrinking. 
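Since Effect2PL and Pre2PL are not shown, the following Python sketch gives one possible reading of the informal description. Everything in it is an assumption: the dictionary layout of the state, the rule that a downgrade moves a transaction into the shrinking phase, and the rule that a shared lock may be upgraded by a write when no other transaction holds a lock on the item.

```python
from copy import deepcopy


def initial_state(transactions):
    """All data items are unlocked and no transaction is shrinking."""
    return {"xset": {t: set() for t in transactions},
            "sset": {t: set() for t in transactions},
            "shrinking": set()}


def pre_2pl(state, action):
    """Assumed precondition: locking is folded into the R and W actions."""
    name, t, x = action
    others = [u for u in state["xset"] if u != t]
    holds_x = x in state["xset"][t]
    holds_s = x in state["sset"][t]
    growing = t not in state["shrinking"]
    if name == "R":
        # Either a lock is already held, or a new shared lock can be granted.
        return holds_x or holds_s or (
            growing and all(x not in state["xset"][u] for u in others))
    if name == "W":
        # Either the exclusive lock is held, or it can be granted now.
        return holds_x or (growing and all(
            x not in state["xset"][u] and x not in state["sset"][u]
            for u in others))
    if name == "unlock":
        return holds_x or holds_s
    if name == "downgrade":
        return holds_x
    return False


def effect_2pl(state, action):
    """Assumed effect: returns the successor state, leaving `state` untouched."""
    name, t, x = action
    new = deepcopy(state)
    if name == "R" and x not in new["xset"][t]:
        new["sset"][t].add(x)
    elif name == "W":
        new["sset"][t].discard(x)
        new["xset"][t].add(x)
    elif name == "unlock":
        new["xset"][t].discard(x)
        new["sset"][t].discard(x)
        new["shrinking"].add(t)
    elif name == "downgrade":
        new["xset"][t].discard(x)
        new["sset"][t].add(x)
        new["shrinking"].add(t)  # assumption: downgrading starts the shrinking phase
    return new
```

With this reading, a second transaction cannot write an item while the first holds an exclusive lock on it, and a transaction that has unlocked an item cannot acquire a lock on a new item.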
3 View and Conflict Serializability 

To define view serializability, we first define view equivalence between schedules, 
following [SKS97]. Consider two schedules S1 and S2, where the same set of 
transactions participates in both schedules. 

Definition 1. The schedules S1 and S2 are view equivalent if the following three 
conditions are met: 

1. For each data item x, if transaction T1 reads the initial value of x in schedule 
S1 then, in schedule S2, transaction T1 must also read the initial value of x. 

2. For each data item x, if transaction T1 reads a value of x in schedule S1 and 
the value was produced by transaction T2 then, in schedule S2, transaction 
T1 must also read the value of x that was produced by transaction T2. 

3. For each data item x, the transaction T1 (if any) that performs the last write 
action on x in schedule S1, must also perform the last write action on x in 
schedule S2. 

Conditions 1 and 2 ensure that each transaction reads the same values in both 
schedules and, therefore, performs the same computation. Condition 3, coupled 
with conditions 1 and 2, ensures that both schedules result in the same final 
system state. 

The definition of view equivalence can be presented in a more formal way 
using the notion of a reads-from relation [Vid91]. We associate with each schedule 
S a reads-from relation Reads_from(S), not shown here, relating a transaction 
that reads a value of an item to the transaction that wrote this value. Then 
view equivalence can be defined as follows. 

Definition 2. (Equivalent to 1.) The schedules S1 and S2 are view equivalent 
if their reads-from relations are equal: 

view_equiv(S1, S2) : bool = (Reads_from(S1) = Reads_from(S2)) 
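The reads-from relation is not shown in the paper, so the Python sketch below is an assumption about one common way to build it: a pseudo-writer `T_init` stands for the initial value of every item (condition 1 of Definition 1) and a pseudo-reader `T_final` reads every item at the end (condition 3). The names `T_init` and `T_final` are hypothetical.

```python
def reads_from(schedule):
    """Return the set of triples (reader, item, writer) of a schedule.

    Actions are triples (name, transaction, item) with name "R" or "W".
    """
    last_writer = {}
    relation = set()
    for name, t, x in schedule:
        if name == "R":
            # The reader sees the most recent write, or the initial value.
            relation.add((t, x, last_writer.get(x, "T_init")))
        elif name == "W":
            last_writer[x] = t
    # The final state is determined by the last writer of every item.
    for x, t in last_writer.items():
        relation.add(("T_final", x, t))
    return relation


def view_equiv(s1, s2):
    """Two schedules are view equivalent iff their reads-from relations coincide."""
    return reads_from(s1) == reads_from(s2)


a = [("W", "T1", "x"), ("W", "T2", "y"), ("R", "T1", "y")]
b = [("W", "T2", "y"), ("W", "T1", "x"), ("R", "T1", "y")]
print(view_equiv(a, b))
```

In both example schedules T1 reads y from T2, and the last writers of x and y are T1 and T2, so the two relations are equal.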

As mentioned in the introduction, a schedule is serial, represented in PVS by the 
predicate serial(S), if it has no interleaving between actions of different 
transactions. For instance, schedule (W, T2, y) (W, T1, x) (R, T1, y) is serial, 
because the action by T2 precedes both actions by T1. Schedule (W, T1, x) (W, 
T2, y) (R, T1, y) is not serial, because the two actions by T1 are interleaved with 
an action by T2. 
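A minimal sketch of the serial predicate in Python, assuming actions are triples (name, transaction, item); it is not the PVS definition, which is not shown. A schedule is rejected as soon as a transaction reappears after another transaction has taken over.

```python
def serial(schedule):
    """True iff the actions of each transaction form one contiguous block."""
    finished = set()   # transactions whose block has already ended
    current = None     # transaction whose block is being read
    for _, t, _ in schedule:
        if t != current:
            if t in finished:
                return False
            if current is not None:
                finished.add(current)
            current = t
    return True


print(serial([("W", "T2", "y"), ("W", "T1", "x"), ("R", "T1", "y")]))
print(serial([("W", "T1", "x"), ("W", "T2", "y"), ("R", "T1", "y")]))
```

The first call corresponds to the serial example in the text, the second to the interleaved one.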

A schedule S belongs to the set of view serializable schedules, denoted by 
View_serializable, iff it is view equivalent to a serial schedule. 

View_serializable : setof[Schedules] = 
  { S | EXISTS S0 : serial(S0) AND view_equiv(S, S0) } 

Next we explain the notion of conflict equivalence. Suppose S includes two 
consecutive actions a1 = (A1, T1, x) and a2 = (A2, T2, y), where A1 and A2 
belong to { R, W }. Thus S = S1 a1 a2 S2 for some subschedules S1 and S2. 
As explained in [SKS97], the order of a1 and a2 does not influence the result 
of computation if either x ≠ y or (x = y and A1 = A2 = R). If x = y and (A1 
= W or A2 = W), then the order of a1 and a2 matters, i.e. it changes the result 
of computation. Observe that T1 = T2 is allowed, assuming that actions of a 
transaction are partially ordered rather than totally ordered as in [SKS97]. 

Definition 3. The actions (A1, T1, x) and (A2, T2, y) are conflicting iff x 
= y and (A1 = W or A2 = W). 

Definition 4. The schedules S1 and S2 are elementary equivalent iff S1 = S3 
a1 a2 S4, S2 = S3 a2 a1 S4 and the actions a1 and a2 are not conflicting. 

Definition 5. The schedules S1 and S2 are conflict equivalent, denoted 
conf_equiv(S1, S2), iff there is a finite sequence of schedules S_0, S_1, ..., S_k, 
k >= 0, such that S1 = S_0, S2 = S_k and for all i < k the schedules S_i and 
S_(i + 1) are elementary equivalent. 
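Definitions 3 to 5 can be read as a search problem: conflict equivalence is reachability under swaps of adjacent non-conflicting actions. The Python sketch below explores this by breadth-first search. It is exponential in the worst case and meant only for small examples; the function names are hypothetical and it is not how the PVS proofs proceed.

```python
from collections import deque


def conflicting(a1, a2):
    """Definition 3: same item and at least one of the actions is a write."""
    (n1, _, x), (n2, _, y) = a1, a2
    return x == y and (n1 == "W" or n2 == "W")


def elementary_neighbours(schedule):
    """Definition 4: all schedules obtained by one swap of adjacent,
    non-conflicting actions."""
    for i in range(len(schedule) - 1):
        a1, a2 = schedule[i], schedule[i + 1]
        if not conflicting(a1, a2):
            yield schedule[:i] + (a2, a1) + schedule[i + 2:]


def conf_equiv(s1, s2):
    """Definition 5: s2 is reachable from s1 by a finite chain of swaps."""
    s1, s2 = tuple(s1), tuple(s2)
    seen, queue = {s1}, deque([s1])
    while queue:
        s = queue.popleft()
        if s == s2:
            return True
        for n in elementary_neighbours(s):
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return False


a = [("W", "T1", "x"), ("W", "T2", "y"), ("R", "T1", "y")]
b = [("W", "T2", "y"), ("W", "T1", "x"), ("R", "T1", "y")]
print(conf_equiv(a, b))
```

Here one swap of the first two actions, which touch different items, turns the first schedule into the second.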

A schedule S belongs to the set of conflict serializable schedules, denoted by 
Conf_serializable, iff it is conflict equivalent to a serial schedule. 

Conf_serializable : setof[Schedules] = 
  { S | EXISTS S0 : serial(S0) AND conf_equiv(S, S0) } 

Since swaps of nonconflicting actions do not change the result of computation, 
we can expect that they do not change the reads-from relation either. Indeed, 
we have proved in PVS the theorem ConfView, expressing that conflict equivalent 
schedules S1 and S2 are also view equivalent: 

ConfView : THEOREM Conf_equiv(S1, S2) IMPLIES View_equiv(S1, S2) 

4 Our Method of Verification 

We present a general method for mechanical verification of conflict serializabil- 
ity. Our approach is a modification of a traditional method for proving conflict 
serializability based on conflict graphs. We do not use graphs, but do need a 
notion of conflicting transactions which is defined as a conflict relation. 

Definition 6. A conflict relation Conflict(S) of a schedule S is defined as 
follows: a pair (T1, T2) belongs to Conflict(S) iff T1 ≠ T2 and 

— S includes actions a1 and a2 by T1 and T2 respectively 

— a1 precedes a2 in S 

— a1 and a2 are conflicting. 
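Definition 6 translates directly into a double loop over the schedule. The Python sketch below assumes actions are triples (name, transaction, item); the function name is hypothetical.

```python
def conflict_relation(schedule):
    """Return the set of pairs (T1, T2), T1 != T2, such that some action of T1
    precedes a conflicting action of T2 in the schedule."""
    schedule = list(schedule)
    relation = set()
    for i, (n1, t1, x) in enumerate(schedule):
        for n2, t2, y in schedule[i + 1:]:
            if t1 != t2 and x == y and (n1 == "W" or n2 == "W"):
                relation.add((t1, t2))
    return relation


print(conflict_relation([("W", "T1", "x"), ("W", "T2", "y"), ("R", "T1", "y")]))
```

For this schedule the only conflict is between the write of y by T2 and the later read of y by T1.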

It is well-known (although not mechanically verified) that a schedule S is conflict 
serializable iff the relation Conflict(S), considered as a graph in which nodes 
are transactions, is acyclic. Our method does not use graph theory, but assigns 
timestamps to transactions, using an irreflexive order on timestamps. A time 
domain Time is some domain with a transitive, irreflexive order, for instance the 
set of natural, rational or real numbers with the conventional order. A timestamp 
TS is a function from Transactions to Time. 

Our method is based on the notion of conflict-preserving timestamps (CPT). 

CPT(S, TS) : bool = FORALL T1, T2 : Conflict(S)(T1, T2) IMPLIES 
                                     TS(T1) < TS(T2) 

Definition 7. A timestamp TS is a conflict-preserving timestamp (CPT) with 
respect to schedule S iff CPT(S, TS) = TRUE. 

If a schedule S has a CPT then the transitive closure of Conflict (S) is irreflex- 
ive, because < is an irreflexive order on Time. 

A schedule S belongs to the set of ordered schedules Ordered iff there is a 
timestamp TS which is conflict-preserving with respect to S. 

Ordered : setof[Schedules] = { S | EXISTS TS : CPT(S, TS) } 

We proved that any ordered schedule is conflict serializable, and any conflict 
serializable schedule is ordered. The proof has been constructed by means of the 
interactive proof checker of PVS and is technically fairly complicated. 

OrdSerializable : THEOREM Ordered = Conf_serializable 
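For a finite set of transactions, membership in Ordered can be illustrated by trying to construct a timestamp. The Python sketch below takes a conflict relation as a set of pairs and uses the standard-library `graphlib` topological sort; this is an illustration of the definition under that assumption, not the PVS proof, and `find_cpt` is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter


def cpt(conflicts, ts):
    """ts is conflict-preserving iff every conflict pair is ordered by <."""
    return all(ts[t1] < ts[t2] for t1, t2 in conflicts)


def find_cpt(transactions, conflicts):
    """Return a conflict-preserving timestamp as a dict, or None if the
    conflict relation is cyclic and hence no such timestamp exists."""
    sorter = TopologicalSorter({t: set() for t in transactions})
    for t1, t2 in conflicts:
        sorter.add(t2, t1)  # t1 must receive a smaller timestamp than t2
    try:
        order = list(sorter.static_order())
    except CycleError:
        return None
    return {t: position for position, t in enumerate(order)}


ts = find_cpt({"T1", "T2"}, {("T2", "T1")})
print(ts, cpt({("T2", "T1")}, ts))
```

Natural numbers with the usual order serve as the time domain here; a cyclic relation such as {(T1, T2), (T2, T1)} yields no timestamp.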

Theorem OrdSerializable provides a basis for a sound and complete method 
for proving serializability. Given a particular protocol, we prove that each sched- 
ule allowed by this protocol is ordered, i.e. has a conflict-preserving timestamp. 
Thus, for a particular protocol, the aim is to prove the following theorem. 

ProtocolOrdered : THEOREM subset?(protocol, Ordered) 

After that, theorem OrdSerializable implies that protocol indeed ensures 
conflict serializability: 

ProtocolCS : THEOREM subset?(protocol, Conf_serializable) 

We successfully applied our method to the machine-checked verification of the 
Timestamp Ordering protocol and the 2PL protocol. 

5 Extensions of (Serializable) Protocols 

Although we have formulated in the previous section a complete method to prove 
conflict serializability, it is not always easy to find a conflict-preserving times- 
tamp function for any schedule (and to prove that it actually is one). Observing 
that many protocols can be seen as extensions of a basic protocol (such as Times- 
tamp Ordering or 2PL), we investigate how we can obtain serializability of an 
extension from serializability of a basic protocol. First we define the notion of 
an extension more precisely. 

We say that protocol NewProt is an extension of protocol OldProt iff 

— OldActionNames, the set of atomic actions of OldProt, is a subset of 
NewActionNames, the set of atomic actions of NewProt. 

— NewStates, the control part of NewProt, is obtained from OldStates, the 
control part of OldProt, by adding a record ext of type Extension, 
representing the added control information: 

NewStates : TYPE = [# old : OldStates, ext : Extension #] 

Our goal is to prove that if OldProt ensures conflict serializability and extension 
NewProt satisfies certain conditions, then NewProt also ensures conflict serializ- 
ability. Below we derive the required conditions during the construction of the 
proof. 

Let Conf_ser_Old and Conf_ser_New be instantiations of the set of schedules 
Conf_serializable for schedules from OldProt and NewProt, respectively. Our 
aim is to prove the following theorem. 

MainTheorem : THEOREM subset?(OldProt, Conf_ser_Old) IMPLIES 
                      subset?(NewProt, Conf_ser_New) 

Proof: Suppose OldProt ensures conflict serializability and schedule NewS is 
accepted by NewProt. The proof that NewS is conflict serializable consists of two 
steps. 

Step 1 We prove that NewS is a refinement of some schedule OldS, accepted by 
OldProt, i.e. it is obtained from OldS by adding some actions. To construct OldS, 
we simply remove from NewS all added actions, i.e. all actions that do not occur 
in OldActionNames. The result is formally defined by function Extract (NewS) . 
Note that we don’t remove any read or write actions, because R and W belong 
to OldActionNames. The following theorem expresses that Extract (NewS) is 
accepted by OldProt. 

ExtractOld : THEOREM NewProt(NewS) IMPLIES OldProt(Extract(NewS)) 

As we show below, the proof of this theorem reveals the required correctness 
conditions. Since OldProt is conflict serializable, theorem ExtractOld implies 
Conf_ser_Old(Extract(NewS)). 

Step 2 If Extract(NewS) is conflict serializable, then so is NewS: 

ConfNewOld : THEOREM Conf_ser_Old(Extract(NewS)) IMPLIES 
                     Conf_ser_New(NewS) 

The proof of this theorem uses completeness of our verification method for 
conflict serializability. Since it implies Conf_ser_New(NewS), this completes the proof 
of theorem MainTheorem. End Proof 

It remains to prove theorem ExtractOld and to derive the required correctness 
conditions. 

Proof of theorem ExtractOld 

Assume NewProt(NewS). Then there exists a run NewR = 
$s_0 \xrightarrow{a_0} s_1 \xrightarrow{a_1} \dots s_n \xrightarrow{a_n} s_{n+1}$ 
of NewProt such that NewS = ActionSeq(NewR), i.e. NewS = $a_0 a_1 \dots a_n$. Let 
$a'_0 a'_1 \dots a'_k$ be the sequence obtained from $a_0 a_1 \dots a_n$ by removing all actions that 
are not in OldActionNames, i.e., Extract(NewS) = $a'_0 a'_1 \dots a'_k$. 

To prove OldProt(Extract(NewS)), we construct a run 
$s'_0 \xrightarrow{a'_0} s'_1 \xrightarrow{a'_1} \dots \xrightarrow{a'_k} s'_{k+1}$ 
of OldProt. This run is extracted from run NewR by the function ExtractR, which 
removes from a run of NewProt any action that is not in OldActionNames, together 
with its successor state. Moreover, we take only the old part of the remaining states. 
Since Extract and ExtractR both remove the same actions (those with action 
names not in OldActionNames), observe that ActionSeq(ExtractR(NewR)) = 
Extract(ActionSeq(NewR)) = Extract(NewS) = $a'_0 a'_1 \dots a'_k$. Hence, it remains 
to prove that OldR = ExtractR(NewR) is a run of OldProt. 
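The two extraction functions can be sketched as follows in Python. The data layout is an assumption made for illustration: a run is a pair (states, actions) with one more state than actions, and each extended state is a dictionary with keys `old` and `ext`.

```python
def extract(schedule, old_names):
    """Drop every action whose name is not an old action name."""
    return [a for a in schedule if a[0] in old_names]


def extract_r(run, old_names):
    """Drop every added action together with its successor state, and keep
    only the old part of the remaining states."""
    states, actions = run
    old_states = [states[0]["old"]]
    old_actions = []
    for action, successor in zip(actions, states[1:]):
        if action[0] in old_names:
            old_actions.append(action)
            old_states.append(successor["old"])
    return old_states, old_actions


# Hypothetical run: the old part is a counter, the extension a request queue.
new_run = (
    [{"old": 0, "ext": ()}, {"old": 0, "ext": ("q",)}, {"old": 1, "ext": ("q",)}],
    [("Wrequest", "T1", "x"), ("W", "T1", "x")],
)
old_states, old_actions = extract_r(new_run, {"R", "W"})
print(old_states, old_actions)
```

In this example the action sequence of the extracted run equals the extraction of the action sequence, and its last state is the old part of the last state of the original run, which is the shape of the two-part statement proved by induction.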

For any run r, let last(r) denote the last state of r. Instead of proving 
that ExtractR(NewR) is a run of OldProt, it is more convenient to prove the 
following stronger statement, consisting of two parts: 

(i) ExtractR(NewR) is a run of OldProt and 

(ii) last(ExtractR(NewR)) = old(last(NewR)). 

The proof proceeds by induction on the length of ActionSeq(NewR) . 

Basic Step Let length(ActionSeq(NewR)) = 0. Then NewR = NewInitState 
and, by definition of ExtractR, ExtractR(NewR) = old(NewInitState). 
Hence, ExtractR(NewR) is a run if old(NewInitState) is equal to the initial 
state of OldProt. Then also (ii) is satisfied. This leads to the first condition. 

Condition 1 old(NewInitState) = OldInitState 

Induction Step Let length(ActionSeq(NewR)) = m + 1. Then NewR = 
NewR1 $\xrightarrow{a}$ last(NewR) for some run NewR1. We distinguish two cases. 

act(a) ∉ OldActionNames Then ExtractR(NewR) = ExtractR(NewR1). For 
part (i), recall that by the induction hypothesis ExtractR(NewR1), and 
hence also ExtractR(NewR), is a run of OldProt. 

For (ii), note that last(ExtractR(NewR)) = last(ExtractR(NewR1)) 
= old(last(NewR1)), using the induction hypothesis. To obtain 
old(last(NewR1)) = old(last(NewR)), we introduce a condition 
expressing that if we apply a newly added action aa to an extended state 
es1, then the old part of it should not change. 

Condition 2 

NewEffect(es1, aa, es2) IMPLIES old(es1) = old(es2) 

act(a) ∈ OldActionNames By definition of ExtractR, we have in this case 
ExtractR(NewR) = ExtractR(NewR1) $\xrightarrow{a}$ old(last(NewR)). 

By the induction hypothesis, part (ii), we have 

last(ExtractR(NewR1)) = old(last(NewR1)). (*) 

To prove (i), note that ExtractR(NewR) is a run of OldProt if the 
following two conditions are satisfied. 

— a is allowed in the last state of ExtractR(NewR1), that is, 

OldPre(last(ExtractR(NewR1)), a) = TRUE. 

By (*), it remains to prove OldPre(old(last(NewR1)), a) = TRUE. 
Since a is allowed in the last state of NewR1, we have that 
NewPre(last(NewR1), a) = TRUE. Hence it is sufficient to require 
that any old action oa which is allowed in an extended state es 
according to NewPre is also allowed in old(es) according to 
OldPre. 

Condition 3 NewPre(es, oa) IMPLIES OldPre(old(es), oa) 

— old(last(NewR)) is obtained from last(ExtractR(NewR1)) by 
applying a to it, i.e. 

OldEffect(last(ExtractR(NewR1)), a, old(last(NewR))) = TRUE. 

By (*), it remains to prove 

OldEffect(old(last(NewR1)), a, old(last(NewR))) = TRUE. 

Since last(NewR) is obtained from last(NewR1) by applying a to it, 
we have NewEffect(last(NewR1), a, last(NewR)) = TRUE. Hence 
it is sufficient to require, for any old action oa, that NewEffect must 
transform the old part of an extended state es1 in the same way 
OldEffect does. 

Condition 4 NewEffect(es1, oa, es2) IMPLIES 
            OldEffect(old(es1), oa, old(es2)) 

This proves (i). To prove (ii), observe that by the definition of ExtractR, 
in this case 

last(ExtractR(NewR)) = old(last(NewR)). 

This completes the induction step and also the proof of ExtractOld. End Proof 

To implement extensions in PVS, we define a general PVS theory ProtExtend. 
As parameters, it has all types and predicates that are needed to define OldProt 
and NewProt. Theorem MainTheorem, which establishes the main result, is proved 
in ProtExtend. The conditions 1 through 4 mentioned above are added to this 
theory by including them as four assumptions. If any theory imports ProtExtend 
then a proof of these assumptions is required. 

Given a conflict serializable protocol OldProt we can prove serializability of 
an extension NewProt, by importing theory ProtExtend. This requires a proof 
of the four assumptions. Once they have been proved, we can use MainTheorem, 
and obtain conflict serializability of NewProt. 
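For a protocol with finitely many states and actions, the four assumptions can be checked by brute force, which may help when experimenting with a candidate extension before attempting the PVS proof. The Python sketch below is only such an experiment aid: the function `check_conditions` and the toy protocol are hypothetical, and extended states are modelled as pairs (old part, extension).

```python
def check_conditions(old, new_init, old_init, new_states, actions, old_names,
                     new_pre, new_effect, old_pre, old_effect):
    """Return the truth values of Conditions 1-4 over finite enumerations."""
    c1 = old(new_init) == old_init
    c2 = c3 = c4 = True
    for es1 in new_states:
        for a in actions:
            is_old = a[0] in old_names
            if is_old and new_pre(es1, a) and not old_pre(old(es1), a):
                c3 = False
            for es2 in new_states:
                if not new_effect(es1, a, es2):
                    continue
                if is_old:
                    c4 = c4 and old_effect(old(es1), a, old(es2))
                else:
                    c2 = c2 and old(es1) == old(es2)
    return c1, c2, c3, c4


# Toy protocol: the old state counts writes modulo 2, the extension
# records whether a request is pending.
OLD_NAMES = {"R", "W"}
ACTIONS = [("R", "T1", "x"), ("W", "T1", "x"), ("Wrequest", "T1", "x")]
NEW_STATES = [(o, e) for o in (0, 1) for e in (False, True)]


def old(es):
    return es[0]


def old_pre(s, a):
    return True


def old_effect(s1, a, s2):
    return s2 == ((s1 + 1) % 2 if a[0] == "W" else s1)


def new_pre(es, a):
    return True


def new_effect(es1, a, es2):
    if a[0] in OLD_NAMES:
        return old_effect(es1[0], a, es2[0]) and es1[1] == es2[1]
    return es2 == (es1[0], True)   # an added action only touches the extension


print(check_conditions(old, (0, False), 0, NEW_STATES, ACTIONS, OLD_NAMES,
                       new_pre, new_effect, old_pre, old_effect))
```

An added action that modified the old part of the state would make the second component of the result false.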

6 Two Extensions of the 2PL Protocol 

We have applied our method to the basic 2PL protocol, described in section 2. 
This protocol is extended in two steps, leading to a realistic protocol which is 
serializable by construction. 

First extension: adding a sequence of waiting transactions. In the 
first step, we associate with each data item a sequence of transactions that are 
waiting for the permission to read or write this data item. If a transaction is 
not allowed to read or write a data item x immediately (because it is currently 
locked in an incompatible mode), the corresponding action is inserted into the 
sequence of x. After x becomes available, a postponed action from the sequence 
of x may be executed. 

The operation of inserting an action into a sequence is modeled by read-request 
actions (Rrequest) and write-request actions (Wrequest). The extension 
of the state consists of a function that maps each data item to a finite 
sequence, consisting of read- and write-requests performed by certain transactions; 
in an initial state all sequences are empty. A new effect predicate transforms the 
state in the same way Effect2PL does for old actions and leaves the old part 
of the state unchanged for added actions. It also includes an additional predicate 
that defines how to insert requests into and remove requests from the waiting 
sequences. A new precondition ensures that not only the preconditions defined by 
Pre2PL are satisfied, but also some additional preconditions. 

Second step: adding priorities to waiting transactions. We define a 
second-level extension of the 2PL protocol by extending the first-level extension 
above such that the processing of transactions depends on their priorities. A 
priority function PR assigns to each transaction T its priority PR(T) from the set 
of natural numbers. 

We also introduce the notion of urgent transactions, which is important for 
real-time protocols. Assume given a natural number U. Transaction T is called 
urgent with respect to U, if PR(T) >= U. We define our protocol in a new theory, 
such that its set of parameters includes PR and U. Changing PR and U, we obtain 
different protocols. Therefore our theory actually defines a class of protocols. 

This extension does not introduce any new control information or any new 
actions. Instead, it introduces some restrictions on the order in which 
transactions are performed. The aim of these new restrictions is to ensure that “urgent” 
transactions obtain immediate access to data items, whereas non-urgent 
transactions are served on a first-in, first-out basis. 

Suppose a data item x has a sequence xs. We define a predicate urgent_exist, 
which expresses that xs includes requests from urgent transactions. If 
urgent_exist (xs) = TRUE, then we must execute one of the urgent transac- 
tions with the highest priority MaxPriority (xs) . Otherwise, we may execute 
the first-inserted request of the waiting sequence. 
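The selection rule can be sketched in Python as follows. The representation is an assumption: the waiting sequence xs is a list of pairs (request name, transaction) in insertion order, PR is a dictionary, and `eligible_requests` is a hypothetical helper returning the requests that may be executed next.

```python
def urgent_exist(xs, pr, u):
    """True iff the waiting sequence contains a request of an urgent transaction."""
    return any(pr[t] >= u for _, t in xs)


def max_priority(xs, pr):
    """Highest priority among the transactions waiting in xs (xs non-empty)."""
    return max(pr[t] for _, t in xs)


def eligible_requests(xs, pr, u):
    """Requests that may be executed next under the priority rule."""
    if not xs:
        return []
    if urgent_exist(xs, pr, u):
        # If an urgent request exists, the overall maximum is urgent as well.
        top = max_priority(xs, pr)
        return [request for request in xs if pr[request[1]] == top]
    return [xs[0]]   # no urgent request: first in, first out


PR = {"T1": 1, "T2": 5, "T3": 5}
XS = [("Rrequest", "T1"), ("Wrequest", "T2"), ("Rrequest", "T3")]
print(eligible_requests(XS, PR, 3))
print(eligible_requests(XS, PR, 10))
```

With threshold U = 3 the two requests of priority 5 are eligible; with U = 10 no transaction is urgent and only the first-inserted request is.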

Correctness of the obtained extensions. After importing the theory 
ProtExtend with corresponding parameters for both protocols, it turned out to 
be very easy to prove that our four assumptions are satisfied for both protocols. 
Therefore our extensions indeed ensure conflict serializability. 

Note that one may satisfy the conflict serializability condition by not allowing 
any schedule. Therefore, we additionally show that for every valid schedule in the 
initial protocol there is a representative in the extended protocol. For the first 
extension of the 2PL protocol presented above, it is easy to see that for every 
schedule S in the 2PL protocol there is a representative S' in the extension, 
which consists of the same actions. Let S' be a schedule where a transaction 
never tries to read or write a data item if it is not immediately available; then 
all sequences of requests are always empty and S' is indeed accepted by the 
extension. The same holds for the second extension of the 2PL protocol. 

7 Concluding Remarks 

We have presented a formal framework for the specification of concurrency con- 
trol protocols and the verification of serializability, and successfully applied it 
to the verification of the 2PL protocol and the Timestamp Ordering protocol. 
Mechanical support has been obtained by formulating this framework in the lan- 
guage of the verification system PVS, and all proofs have been constructed by 
means of the interactive theorem prover of PVS. 

Moreover, a systematic way to extend serializable concurrency control pro- 
tocols has been developed. If such an extension satisfies four simple verification 
conditions, it is serializable by construction. This can be applied in a hierarchi- 
cal way, thus complex protocols can be obtained by a sequence of extensions of 
a basic concurrency control protocol. An old, serializable, protocol can be ex- 
tended to a new protocol by adding more control information to the state and 
introducing additional control actions. One has to define the new initial state, 
a new precondition for all actions and a new effect predicate which describes 
the state change after each action. Then the new protocol is serializable if the 
following conditions are satisfied. 

1. Ignoring the added control part, the new initial state equals the old initial 
state. 

2. A new action only affects the added part of the state; it does not change the 
original part of the state. 

3. The new precondition of an old action implies its old precondition. 

4. The new effect of an old action implies its old effect. 

There are several directions for future work. We intend to investigate more 
protocols and develop more detailed strategies for their verification. We may 
also add timing, i.e. extend our method to real-time database protocols. Another 
possibility is to study not only serializability for databases, but also more general 
protocols and correctness notions such as atomicity of transactions. 
Abstract. This paper presents a platform-independent approach to 
detecting shared-memory parallelism. It gives a brief overview of the Automatic 
Parallelizing Expert Toolkit, which is under development, and describes the basic 
concepts used by this toolkit. 



1 Introduction 

Known approaches to porting existing serial programs onto parallel platforms 
can be divided into two groups: 

— using automatically parallelizing compilers [1,2]; 

— adding to the source code of the serial program special directives, which 
explicitly specify the actions to be taken by the compiler and run-time system in 
order to execute the program in parallel. 

Both of these approaches have some shortcomings. 

Parallelizing compilers usually do not detect all the regions where 
parallelization is possible. Moreover, it can be difficult to determine whether 
the compiler detected the parallelism or not, and why. Adding new parallelizing 
techniques to such compilers is up to the vendor, so a developer cannot rely on the 
timely release of the techniques needed. This approach also leads to portability 
problems, since different compilers can differ significantly in parallelization quality. 

Explicit specification of all the necessary parallel regions may be an even more 
complex task. Though the resulting program would probably perform better, this 
is hardly program reuse: the approach is comparable to writing 
completely new code. 

A way out of this situation can be found in restructuring tools 
that automatically insert parallelization directives into ordinary serial source 
code; the tool's user should be able to add new parallelization techniques. 

This approach has the following obvious advantages: 

— the output of such a tool is meaningful source code, so the developer clearly 
understands what parallelization has been made; 

— the appearance of directive sets such as the OpenMP API [3] solves the problem of 
cross-platform portability within the class of SMP platforms; 

— a tool based on expert-systems technology can explain all the parallelizing 
actions made; 

— if the existing set of techniques does not satisfy a particular developer, it 
is possible to expand this set with new ones. 

It is clear that such a tool needs powerful models of both the "parallel 
program" and the "parallelization technique". This paper presents these models, 
targeted at the implementation of the Automatic Parallelizing Expert Toolkit (APET), 
and provides an overview of the toolkit architecture. 

2 Parallel Program and Execution Models 

APET utilizes two models of a parallel program (one at a time): 

— An extension of the Model of Structured Program (MSP [4]), called the Model of 
Parallel Program (MPP [5]). This model represents a parallel program in an 
n-processor SMP system as n serial programs with a common data space and 
additional synchronization points, so-called barriers. 

— The OpenMP-compliant Fork-Join Model (FJM). In this model a program begins 
execution as a single thread of execution called the master thread. The master 
thread executes as a serial region until a parallel construct creates a team 
of threads, which execute in parallel. Upon completion of the parallel construct, 
the threads in the team synchronize at an implicit barrier, and only the 
master thread continues execution. Work-sharing directives, nested parallelism 
and orphaning are permitted. 

It has been shown that for sufficiently large n these models conform to each 
other. The MPP is quite a simple model to operate, but APET's output in this 
model can be used only as an illustration of the toolkit's actions. In contrast, the 
FJM is much more complex; APET's output using this model can be interpreted 
as a C or FORTRAN program with OpenMP-compliant extensions and then 
immediately be compiled on most of the popular SMP platforms. 
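The control flow of the fork-join model can be illustrated with a small Python sketch. This is only an analogy using the standard-library thread pool, not OpenMP and not APET output; `parallel_region` and `program` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_region(work_items, body, team_size=4):
    """Fork a team of worker threads, share the work items among them,
    and join at the end of the region."""
    with ThreadPoolExecutor(max_workers=team_size) as team:
        results = list(team.map(body, work_items))
    # Leaving the with-block waits for all workers, which plays the role
    # of the implicit barrier at the end of a parallel construct.
    return results


def program(data):
    prepared = [x + 1 for x in data]                       # serial region
    squares = parallel_region(prepared, lambda x: x * x)   # parallel region
    return sum(squares)                                    # serial region again


print(program([1, 2, 3]))
```

Only the calling thread continues after the region, mirroring the master thread of the model.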

3 Parallelizing Technique Model 

A parallelizing technique in APET is represented in three parts: 

— Condition: a logical expression in terms of MPP or FJM. The parallelizing 
technique can be applied if and only if this condition is true. 

— A subset of the variables defined in Condition. These variables are the parts of 
the program that will change due to the given technique, the so-called 
"Parallelization Region". Note that the value of each of these variables is one or more 
consecutive operators and/or directives. 

— Parallelizing Transformation: the set of expressions determining the new 
value of the "Parallelization Region". 
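The three-part structure can be made concrete with a toy Python sketch. It is purely illustrative and does not reflect APET's actual representation: a program is modelled as a list of statement strings, a condition returns variable bindings as (start index, length) pairs, and the marker comment `independent` stands in for a real dependence analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Program = List[str]
Bindings = Dict[str, Tuple[int, int]]   # variable -> (start index, length)


@dataclass
class Technique:
    condition: Callable[[Program], Optional[Bindings]]   # part 1
    region: str                                          # part 2
    transform: Callable[[Program], Program]              # part 3


def apply_technique(program, technique):
    """Apply the technique if its condition holds; report whether it was applied."""
    bindings = technique.condition(program)
    if bindings is None:
        return program, False
    start, length = bindings[technique.region]
    replaced = technique.transform(program[start:start + length])
    return program[:start] + replaced + program[start + length:], True


def independent_loop(program):
    """Toy condition: a loop marked independent that has no directive yet."""
    for i, stmt in enumerate(program):
        already = i > 0 and program[i - 1].startswith("#pragma omp")
        if stmt.startswith("for ") and "independent" in stmt and not already:
            return {"loop": (i, 1)}
    return None


parallel_for = Technique(
    condition=independent_loop,
    region="loop",
    transform=lambda region: ["#pragma omp parallel for"] + region,
)

serial_code = ["x = 0;", "for (i = 0; i < n; i++) a[i] = b[i]; /* independent */"]
print(apply_technique(serial_code, parallel_for))
```

Returning a flag alongside the new program keeps enough information to report which techniques were or were not applied.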
4 Overview of Automatic Parallelizing Expert Toolkit Architecture 

APET's input is a model of the serial program (MSP), obtained by a specialized 
compiler from high-level languages, together with the knowledge base of 
parallelization techniques. The output is an MPP or FJM model that represents the source 
program after all applicable parallelization techniques have been applied. Both models can 
be converted back into a high-level language form (not necessarily the same language 
as the source). APET also provides an extensive report explaining which of the 
parallelization techniques were (or were not) applied and why (see figure 1). 

Fig. 1. APET's architecture diagram 








Platform Independent Approach for Detecting Shared Memory Parallelism 



It is also possible to define a set of parallelization quality criteria to 
measure some static values of the parallelization produced by a selected set of 
parallelization techniques. 
Abstract. We suggest an extension of the class of cause-effect structures 
by a semantics of hierarchy. As an example of a hierarchical c-e structure 
we use a simulation of the zero-testing operator. Relationships between 
the classes of hierarchical c-e structures and hierarchical Petri nets 
introduced by V.E. Kotov are investigated. 



1 Introduction 

In order to describe concurrent systems, L. Czaja introduced in [1] cause-effect 
structures (CESs), which were inspired by condition/event Petri nets (PNs). 
A CES can be defined as a triple $(X, C, E)$ where $X$ is the set of nodes, and $C$ and $E$ 
are the cause and effect functions from $X$ to the set of formal polynomials over $X$ 
such that $x \in X$ occurs in $C(y)$ iff $y$ occurs in $E(x)$. Each polynomial $C(x)$ ($E(x)$) 
denotes a family of cause (effect) subsets of the node $x$. The operator * combines 
nodes into subsets, and the operator + combines subsets into families. 
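The consistency requirement between C and E can be checked mechanically. In the Python sketch below a formal polynomial is modelled, as an assumption, by a set of frozensets: each frozenset is a product of nodes (operator *), and the outer set is a sum of such products (operator +). The function names are hypothetical.

```python
def occurs(node, polynomial):
    """True iff the node appears in some subset of the polynomial."""
    return any(node in subset for subset in polynomial)


def is_ces(nodes, cause, effect):
    """Check that x occurs in C(y) iff y occurs in E(x), for all nodes x and y."""
    return all(occurs(x, cause[y]) == occurs(y, effect[x])
               for x in nodes for y in nodes)


# Two nodes: a is a cause of b, and b is an effect of a.
nodes = {"a", "b"}
cause = {"a": set(), "b": {frozenset({"a"})}}
effect = {"a": {frozenset({"b"})}, "b": set()}
print(is_ces(nodes, cause, effect))
```

Removing a from the cause polynomial of b while keeping b as an effect of a breaks the requirement.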

Unfortunately, the practical expressiveness of CESs is not sufficient for 
real-life applications. Some supplementary constructions, for instance a semantics 
of coloured tokens or of hierarchy, are necessary. 

Note that the extension of CESs by coloured tokens has been obtained in [7]. 
In ordinary CESs a token, or an active state of a node, denotes the presence of some 
resource. However, this approach does not allow qualitative differences between 
resources functioning in a CES to be distinguished. Moreover, a node cannot 
hold more than one token-resource simultaneously. Sometimes it is important to 
distinguish resources qualitatively. This difference is represented by the colours of 
tokens, and each node may have several differently coloured tokens. 

This work is devoted to constructing a class of hierarchical CESs (HCESs), 
which improves the compactness of the algebraic representation of CESs and enlarges 
their practical expressiveness. 

Relationships between this new class and the class of hierarchical Petri nets 
(HPNs) introduced by V.E. Kotov in [3] are investigated. We prove that every 
HCES has a behaviorally equivalent HPN. 

There was an interesting open problem: in [4] Raczunas investigates the converse 
mapping from PNs to CESs. He remarks that so-called strong equivalence does not 
hold for the converse mapping. We solved this problem in [6] by introducing an 
extension of cause-effect structures, two-level CESs (TCESs). The class of TCESs is 



D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI'99, LNCS 1755, pp. 198-207, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 







a convenient intermediate class between PNs and CESs, because it is strongly 
equivalent to the class of PNs and we can transform any TCES into a structurally 
equivalent CES with the help of the folding transformation. On the other hand, 
each CES has a strongly equivalent TCES. 

The problem of the converse mapping from HPNs to HCESs is solved with the 
help of the class of two-level HCESs. 



2 Preliminaries 

2.1 Regular and Hierarchical Petri Nets 

The algebra of regular Petri nets (RPN) introduced in [2] is generated by the 
class of atomic nets with the use of the set of net operations. 

An atomic net consists of a single transition, labeled by a transition symbol, 
together with one head place and one tail place. 

The concurrency operation (denoted by ",") is defined as an ordinary graph 
union: it superposes one net on another. 

If $N_1 = (P_1, T_1, F_1)$ and $N_2 = (P_2, T_2, F_2)$ then 

$N = (N_1, N_2) = (P_1 \cup P_2, T_1 \cup T_2, F_1 \cup F_2)$ 

Let $h(N)$ denote the set of head places of a net $N$ and $l(N)$ be the set of 
tail places of $N$. By definition, $h(N) = h(N_1) \cup h(N_2)$ and 
$l(N) = l(N_1) \cup l(N_2)$. 

Other net operations can be defined via the concurrency operation and an 
auxiliary merging operation. The latter merges two sets of places in a specific 
way. This involves two suboperations: 1) formation of a set of merged places, 2) 
replacement of two existing sets by a new set. 

Given two sets of places $X$ and $Y$, the forming operation $\times$ results in 
the set $Z$ of merged places: 

$Z = X \times Y = \{x \cup y \mid x \in X, y \in Y\}$ 

The merging operation $M$ merges two sets of places, $X$ and $Y$, in a net 
$N = (P, T, F)$ and generates a new net $M(N, X \times Y) = (P', T', F')$, where 

$P' = (P - (X \cup Y)) \cup (X \times Y)$, $T' = T$, 

$\forall p \in X \times Y : F'(p) = F(x) \cup F(y)$, where $p = x \cup y$. 

The operation of iteration "$*$" merges the sets of head and tail places of the 
net if their intersection is empty: 

$N' = *(N) = M(N, h(N) \times l(N))$ 

By definition, its sets of head and tail places are equal. 

The precedence operation ";" joins two nets by merging the set of tail places 
of the first net with the set of head places of the second net. By definition, 
$h(N) = h(N_1)$ and $l(N) = l(N_2)$. 

The alternative operation "$\vee$" unites two nets by merging their sets of head 
places and their sets of tail places separately. By definition, 
$h(N) = h(N_1) \times h(N_2)$ and $l(N) = l(N_1) \times l(N_2)$. 
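
The net operations above can be sketched in a few lines of Python. This is only an illustrative sketch, not the construction of [2]: the names `Net`, `atomic`, `merge` and so on are hypothetical, places are modelled as frozensets of names so that the merged place $x \cup y$ is literally a set union, and $F$ is modelled as a map from a place to its incident arcs. Corner cases in which a place is both a head and a tail place of an operand are ignored.

```python
from typing import NamedTuple

class Net(NamedTuple):
    flow: dict              # place -> set of (transition, "in" | "out") arcs
    transitions: frozenset
    head: frozenset         # h(N), a set of places
    tail: frozenset         # l(N), a set of places

def atomic(t):
    """Atomic net: one head place, the transition t, one tail place."""
    h, l = frozenset({f"h_{t}"}), frozenset({f"l_{t}"})
    return Net({h: {(t, "in")}, l: {(t, "out")}}, frozenset({t}),
               frozenset({h}), frozenset({l}))

def concurrency(n1, n2):
    """(N1, N2): plain union of places, transitions and arcs."""
    flow = {p: set(arcs) for p, arcs in n1.flow.items()}
    for p, arcs in n2.flow.items():
        flow.setdefault(p, set()).update(arcs)
    return Net(flow, n1.transitions | n2.transitions,
               n1.head | n2.head, n1.tail | n2.tail)

def merge(n, xs, ys):
    """M(N, X x Y): replace X and Y by places x | y with F'(x | y) = F(x) | F(y)."""
    flow = {p: set(arcs) for p, arcs in n.flow.items() if p not in (xs | ys)}
    for x in xs:
        for y in ys:
            flow[x | y] = n.flow[x] | n.flow[y]
    return flow, frozenset(x | y for x in xs for y in ys)

def precedence(n1, n2):
    """(N1; N2): merge the tail places of N1 with the head places of N2."""
    n = concurrency(n1, n2)
    flow, _ = merge(n, n1.tail, n2.head)
    return Net(flow, n.transitions, n1.head, n2.tail)

def alternative(n1, n2):
    """(N1 v N2): merge heads with heads and tails with tails."""
    n = concurrency(n1, n2)
    flow, heads = merge(n, n1.head, n2.head)
    flow, tails = merge(n._replace(flow=flow), n1.tail, n2.tail)
    return Net(flow, n.transitions, heads, tails)

def iteration(n):
    """*(N): merge head and tail places, provided they are disjoint."""
    assert not (n.head & n.tail)
    flow, merged = merge(n, n.head, n.tail)
    return Net(flow, n.transitions, merged, merged)
```

For instance, `precedence(atomic("a"), atomic("b"))` yields three places, the middle one carrying the output arc of `a` and the input arc of `b`.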

Let $E$ be a class of atomic nets, i.e. a class of transition symbols. A net 
formula in the algebra of RPN over the basis $E$ is defined as follows: 

1) each symbol of $E$ is a formula; 

2) if $A$ is a formula, then $*(A)$ is a formula; 

3) if $A$ and $B$ are formulae, then $(A, B)$, $(A; B)$ and $(A \vee B)$ are formulae. 



The class of hierarchical Petri nets (HPN) introduced in [3] is a generalization 
of the class of RPN and is used for modelling hierarchical systems. 

To define HPN, we should divide the class of transition symbols into two 
nonintersecting subclasses: terminal and nonterminal symbols. Correspondingly, 
any transition can be simple or compound. 

HPN is defined by a structural formula constructed from terminal and non- 
terminal symbols using the set of the (regular) net operations and an ordered 
set of nonterminal symbols’ definitions. 

Each such definition looks like s : A, where s is a nonterminal symbol, and 
A is a formula of HPN which is internal for this symbol. 

We have two contextual restrictions: 

1) Any symbol of a structural formula is terminal if it is not defined in this 
formula. 

2) Each nonterminal symbol is defined only once, and it cannot occur in the 
right-hand side of its own definition or of any of the following ones. 

A compound transition may be in a passive or an active state (active when its 
internal net is working). The beginning and the end of a compound transition's 
work are momentary events. 

Fig. 1 shows an HPN which allows us to check whether the place x holds a 
token. Thus, the expressive power of the class of HPNs is greater than that of 
the class of PNs. 




2.2 Cause-Effect Structures 

Cause-effect structures are represented as directed graphs with an additional 
structure imposed on the set of nodes. These graphs, with operations $+$ and $*$ 
corresponding to nondeterministic choice and parallelism, constitute a 
near-semi-ring, where "near" means that distributivity of $*$ over $+$ holds 
conditionally. 

A CES is completely represented by the set of annotated nodes: each node $x$ 
carries as a subscript a formal polynomial $E(x)$ built of (the names of) its 
successors and as a superscript a formal polynomial $C(x)$ built of its 
predecessors, and may be in either an active or a passive state. The active 
state of a node represents the presence of control in it. 

If a node is active, then we try to move control from it simultaneously to 
all its successors which form a product in its lower (subscript) polynomial, 
provided they are passive. Symmetrically, if a node is passive, then we try to 
move control to it simultaneously from all its predecessors which form a product 
of its upper (superscript) polynomial, provided they are active (if no 
predecessors or successors exist, then the upper or lower polynomial is 
$\theta$, which is sometimes omitted). In general, this rule induces complex 
interdependences between nodes with respect to control flow: a group of nodes 
has to "negotiate" the possibility of changing their state with one another. 
Such groups of nodes play a role similar to that of transitions in PNs. They 
are called firing components. The set of all firing components of the CES $U$ 
is denoted by $FC[U]$. 

All the formal definitions of CESs can be found in [1]. Moreover, for better 
understanding and convenience of comparison, we include all these definitions 
in Sect. 3: Definitions 3.1, 3.2, 3.3, 3.6, 3.7, 3.8 and 3.10 without any changes; 
Definition 3.5 modified a little for the case of hierarchy; and Definition 3.12 
with additional (second and third) alternative groups of conditions. 



3 Hierarchical Cause-Effect Structures 

We extend the class of CESs to a hierarchical one in the manner proposed by 
V.E. Kotov for Petri nets. Hierarchical Petri nets (HPNs) have compound 
transitions, but CESs have only one type of vertex: nodes. An attempt to 
introduce compound nodes leads to a complicated and unwieldy construction. 

Our solution is to introduce compound or "global" tokens which appear on 
some directions of moving control between nodes. Each such token includes an 
inner CES. When a group of nodes is ready to move control jointly to their 
successors and at least one of them gives birth to a global token on this 
direction, then: 

- this group of nodes moves control to the CES which is inner for the appearing 
global token; 

- the inner CES works, and when it reaches its final state it moves control 
to all successors of the original group of nodes. 

In fact we have a subclass of hierarchical, or global, firing components. 
This approach allows us to preserve the style and the scheme of defining CESs. 
We add only a globalization function on the set of nodes, and extend the 
semantics correspondingly. 

Definition 1. (3.1.) Let $X$ be a set called a space of nodes and let $\theta$ be 
a symbol called neutral. The least set $Y$ satisfying the following: 
$\theta \in Y$, $X \subseteq Y$, if $K \in Y$ and $L \in Y$ then $(K + L) \in Y$ 
and $(K * L) \in Y$, is the set of polynomials over $X$, denoted by $F[X]$. 



Definition 2. (3.2.) We say that the algebraic system $A = (F[X], +, *, \theta)$ 
is a near-semi-ring of polynomials over $X$ if the following axioms hold for all 
$K \in F[X]$, $L \in F[X]$, $M \in F[X]$, $x \in X$: 

(+) $\theta + K = K + \theta = K$ 

(++) $K + K = K$ 

(+++) $K + L = L + K$ 

(++++) $K + (L + M) = (K + L) + M$ 

(*) $\theta * K = K * \theta = K$ 

(**) $x * x = x$ 

(***) $K * L = L * K$ 

(****) $K * (L * M) = (K * L) * M$ 

(+*) $K * (L + M) = K * L + K * M$, provided that either $L = M = \theta$ or 
$L \neq \theta$ and $M \neq \theta$ 
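
The conditional distributivity in axiom (+*) is easy to see on a concrete model. The following Python sketch is one possible model of the axioms, chosen here for illustration and not taken from [1]: a polynomial is a set of monomials, a monomial is a set of nodes, and $\theta$ is the empty set of monomials. The names `THETA`, `node`, `add` and `mul` are hypothetical.

```python
THETA = frozenset()  # the neutral polynomial: no monomials at all

def node(x):
    """The polynomial consisting of the single node x."""
    return frozenset({frozenset({x})})

def add(k, l):
    """K + L: union of the two families of monomials."""
    return k | l

def mul(k, l):
    """K * L: theta is neutral; otherwise combine monomials pairwise."""
    if not k:
        return l
    if not l:
        return k
    return frozenset(a | b for a in k for b in l)

a, b, c = node("a"), node("b"), node("c")

# Distributivity holds when both summands are different from theta ...
both_sides_equal = mul(a, add(b, c)) == add(mul(a, b), mul(a, c))

# ... but fails when exactly one summand is theta:
# a * (b + theta) = a * b, whereas a * b + a * theta = a * b + a.
lhs = mul(a, add(b, THETA))
rhs = add(mul(a, b), mul(a, THETA))
```

In this model `both_sides_equal` is true while `lhs` and `rhs` differ, which is exactly the situation excluded by the side condition of (+*).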



Definition 3. (3.3.) Let $X$ be a space of nodes and $(F[X], +, *, \theta)$ be a 
near-semi-ring of polynomials. A CES over $X$ is a pair $(C, E)$ of functions: 

$C : X \to F[X]$ (cause function) 

$E : X \to F[X]$ (effect function) 

such that $x$ occurs in the polynomial $C(y)$ iff $y$ occurs in $E(x)$ (then $x$ 
is a cause of $y$ and $y$ is an effect of $x$). The set of all CESs over $X$ is 
denoted by $CE[X]$. The CES is completely represented by the set of annotated 
nodes $x$. 
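
The consistency condition between $C$ and $E$ can be checked mechanically. The sketch below assumes the same illustrative encoding of polynomials as sets of monomials (with the empty set standing for $\theta$); the function names `occurs` and `is_ces` are hypothetical.

```python
def occurs(x, poly):
    """True iff node x occurs in some monomial of the polynomial."""
    return any(x in mono for mono in poly)

def is_ces(nodes, C, E):
    """Check: x occurs in C(y) iff y occurs in E(x), for all nodes x, y."""
    return all(occurs(x, C[y]) == occurs(y, E[x])
               for x in nodes for y in nodes)

theta = frozenset()
nodes = {"a", "b"}

# A single arrow a -> b: b is an effect of a, and a is a cause of b.
E = {"a": frozenset({frozenset({"b"})}), "b": theta}
C = {"a": theta, "b": frozenset({frozenset({"a"})})}
consistent = is_ces(nodes, C, E)

# Dropping the cause annotation of b breaks the condition.
C_broken = {"a": theta, "b": theta}
inconsistent = is_ces(nodes, C_broken, E)
```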



Definition 4. (3.4.) Let $(C, E)$ be a CES over $X$, and for each $x \in X$ let its 
effect polynomial be transformed to the canonical form $E(x) = \sum_i E_i(x)$, 
where each $E_i(x)$ is a monomial. The globalization function $G$ assigns a 
global token (i.e. a token which includes an inner CES, or the neutral element 
$\theta$) to each effect direction $E_i(x)$ of each node $x$. Then a hierarchical 
cause-effect structure is a triple of functions 
$(C(X), E(X), G(\langle X, E(X) \rangle))$. 

The set of all HCESs over $X$ is denoted by $HCE[X]$. 



Remark 1. A token may be an ordinary "unfaced" resource denoted by $\theta$. That 
is, if $G(\langle x, E_i(x) \rangle) = \theta$ for some node $x$ and direction 
$E_i(x)$, then the token on this direction is non-compound, and control moves 
without delay. 



Definition 5. (3.5.) Let us define the addition and multiplication of functions 
by the rules: 

$(C_1 + C_2)(x) = C_1(x) + C_2(x)$ and $(E_1 + E_2)(x) = E_1(x) + E_2(x)$, 

$G(\langle x, E_i \rangle) = G_i(\langle x, E_i \rangle)$; 

$(C_1 * C_2)(x) = C_1(x) * C_2(x)$ and $(E_1 * E_2)(x) = E_1(x) * E_2(x)$, 

$G(\langle x, E_1 * E_2 \rangle) = G_1(\langle x, E_1 \rangle) + G_2(\langle x, E_2 \rangle)$. 

Then an algebra of HCESs is obtained as follows. Let $\theta : X \to F[X]$ be the 
constant function $\theta(x) = \theta$; let, for brevity, the HCES 
$(\theta, \theta)$ be denoted by $\theta$; and let $+$ and $*$ on HCESs be defined 
by the following: 

$(C_1, E_1, G_1) + (C_2, E_2, G_2) = (C_1 + C_2, E_1 + E_2, G)$, 

$(C_1, E_1, G_1) * (C_2, E_2, G_2) = (C_1 * C_2, E_1 * E_2, G)$. 

Obviously, if $U_i = (C_i, E_i, G_i) \in HCE[X]$ $(i = 1, 2)$, then 
$U_1 + U_2 \in HCE[X]$ and $U_1 * U_2 \in HCE[X]$. 



Definition 6. (3.6.) A CES $U$ is decomposable iff there exist CESs $V$, $W$ such 
that $\theta \neq V \neq U$, $\theta \neq W \neq U$ and either $U = V + W$ or 
$U = V * W$. 



Definition 7. (3.7.) Let $U$, $V$ be CESs. $V$ is a substructure of $U$ iff 
$V + U = U$. Then we write $V \leq U$. $SUB[U] = \{V : V \leq U\}$. Easy checking 
ensures that $\leq$ is a partial order. The set of all minimal (wrt $\leq$) and 
$\neq \theta$ elements of $SUB[U]$ is denoted by $MIN[U]$. 

Definition 8. (3.8.) For a CES $U$, let $Q = (C_Q, E_Q)$ be a minimal substructure 
of $U$ such that for every node $x$ in $Q$: 

(i) the polynomials $C_Q(x)$, $E_Q(x)$ do not comprise "+"; 

(ii) exactly one polynomial, either $C_Q(x)$ or $E_Q(x)$, is $\theta$. 

Then $Q$ is called a firing component of $U$. 
$FC^1[U] = \{Q \in MIN[U] : \text{(i), (ii) hold}\}$ is the set of all firing 
components of the first level. We denote by ${}^{\bullet}Q$ (pre-set of $Q$) the 
set of nodes $x$ in $Q$ with $C_Q(x) = \theta$, and by $Q^{\bullet}$ (post-set of 
$Q$) the set of nodes $x$ in $Q$ with $E_Q(x) = \theta$. 



Definition 9. (3.9.) If $G(x) \neq \theta$ for any node $x \in {}^{\bullet}Q$, then 
we say that a global firing component $Q$ has an internal CES denoted by 
$G(Q) = \sum_{x \in {}^{\bullet}Q} G(x)$. 



Remark 2. Each internal CES has a set of its own firing components. Thus, the 
union of the sets of firing components over all internal CESs of the first level 
is called the set of firing components of the second level, and so on. The full 
set of firing components of any HCES $U$ (denoted by $FC[U]$) is the union of its 
sets of firing components of all levels. 



Definition 10. (3.10.) A state is a subset of the space of nodes $X$. A node $x$ 
is active in the state $s$ iff $x \in s$, and passive otherwise. 



Definition 11. (3.11.) Let us define, for each global firing component $Q$, two 
supplementary subsets of its nodes: an initial state of its internal HCES $G(Q)$, 
denoted by $S_0(G(Q))$, and a terminal state of $G(Q)$, denoted by 
$S_{end}(G(Q))$. 



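Definition 12. (3.12.) For $Q \in FC[U]$, let $[[Q]]$ denote a binary relation on 
the set of all states: $(s, t) \in [[Q]]$ iff 

${}^{\bullet}Q \subseteq s$, $G(Q) = \theta$, $Q^{\bullet} \cap s = \emptyset$, $t = (s - {}^{\bullet}Q) \cup Q^{\bullet}$ 

or 

${}^{\bullet}Q \subseteq s$, $G(Q) \neq \theta$, $S_0(G(Q)) \cap s = \emptyset$, $t = (s - {}^{\bullet}Q) \cup S_0(G(Q))$ 

or 

$S_{end}(G(Q)) \subseteq s$, $Q^{\bullet} \cap s = \emptyset$, $t = (s - S_{end}(G(Q))) \cup Q^{\bullet}$. 

The semantics $[[U]]$ of an HCES $U$ is a union of relations: 

$[[U]] = \bigcup_{Q \in FC[U]} [[Q]]$ 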
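
The three alternatives of the definition above can be read as a small successor function on states. The following Python sketch is an illustration under simplifying assumptions, not part of the original formalism: a firing component is given directly by its pre-set and post-set, a global component additionally carries the initial and terminal states of its internal HCES, and the internal firing components themselves are not modelled. The names `FiringComponent` and `successors` are hypothetical.

```python
from typing import NamedTuple, Optional

class FiringComponent(NamedTuple):
    pre: frozenset                       # pre-set of Q
    post: frozenset                      # post-set of Q
    s0: Optional[frozenset] = None       # S_0(G(Q)); None means G(Q) = theta
    s_end: Optional[frozenset] = None    # S_end(G(Q)); None means G(Q) = theta

def successors(s, q):
    """All states t with (s, t) in [[Q]], following the three alternatives."""
    result = set()
    if q.s0 is None:
        # Simple component: control moves from the pre-set to the post-set.
        if q.pre <= s and not (q.post & s):
            result.add((s - q.pre) | q.post)
    else:
        # Global component, entry: control moves into the internal HCES.
        if q.pre <= s and not (q.s0 & s):
            result.add((s - q.pre) | q.s0)
        # Global component, exit: control leaves the terminal state.
        if q.s_end <= s and not (q.post & s):
            result.add((s - q.s_end) | q.post)
    return result

simple = FiringComponent(pre=frozenset("a"), post=frozenset("b"))
glob = FiringComponent(pre=frozenset("a"), post=frozenset("b"),
                       s0=frozenset("i"), s_end=frozenset("f"))
```

With these two components, the state `{a}` leads to `{b}` in one step through `simple`, whereas through `glob` it first leads to `{i}`, and only the state `{f}` leads on to `{b}`.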



Remark 3. Firstly, in preserving the condition $Q^{\bullet} \cap s = \emptyset$ 
and similar ones, we follow the tradition of defining the semantics of CESs, 
which requires artificial safety. That is, no node may hold more than one token. 

Secondly, the alternative groups of conditions in Def. 3.12 mean the following: 

- if the firing component is not global, all nodes of its pre-set are active and 
all nodes of its post-set are passive (the requirement of safety), then control 
moves from all nodes of its pre-set to all nodes of its post-set; 

- if the firing component is global, all nodes of its pre-set are active and all 
nodes of the initial state of its internal HCES are passive (the requirement of 
safety), then control moves into the internal HCES, "firing" all nodes of the 
initial state; 

- if the internal HCES of the global firing component has reached the terminal 
state and all nodes of the post-set of this global firing component are passive, 
then control moves to them. 

Fig. 2 shows an HCES which is equivalent, in a sense, to the HPN in Fig. 1: 




Fig. 2. Zero-testing operator 

4 Relationships between HCESs and HPNs 

There is an interesting question about relationships between CESs and PNs. 
In [4] Raczunas states that every CES has a strongly equivalent PN, i.e., two 
bijections exist: between the firing components of CES and transitions of PN, 
and between nodes of CES and places of PN; moreover, the bijections must 
preserve pre- and post-sets of firing components and transitions. 

In [2] Kotov proved that each PN has a behaviorally equivalent regular PN, 
i.e., their sets of languages or traces of firing are equal. So we can formulate: 

Theorem 1. (1.) Each HCES has a behaviorally equivalent HPN. 

Sketch of a proof. In [4] Raczunas proves that each CES has a strongly 
equivalent PN. The only structural difference between a CES and an HCES is the 
globalization function, which does not touch the external cover of the HCES (the 
cover is the HCES in which all global firing components are replaced by simple 
ones). The cover is an ordinary CES, and so it has a strongly equivalent PN, 
which is the cover of some HPN. The cover of the internal HCES of any global 
firing component is also a CES and has a strongly equivalent PN, which is 
internal for the corresponding global transition of this cover-HPN, and so on. 
Finally, with the help of the regularization algorithm (see [2]), we construct a 
behaviorally equivalent HPN from the given set of strongly equivalent external 
and internal PNs. 

Raczunas investigates the converse mapping from PNs to CESs. He remarks 
that strong equivalence does not hold for the converse mapping. 

We solved this problem in [6] by introducing an extension of cause-effect 
structures, two-level CESs (TCESs). Any CES is completely represented by 
the set of annotated nodes, where $E(x)$ and $C(x)$ are polynomials with 
operations $+$ and $*$. We propose to exclude the operation $+$ from the formal 
polynomials and to call the resulting elementary CES (or unalternative CES, 
UCES) a two-level CES of the first syntactic level. Elementary CESs are united 
by the operation $\oplus$ into a set called a two-level CES of the second 
syntactic level, or simply a TCES. Thus, a TCES is a set of sets of annotated 
nodes. 

So, the operation $\oplus$ is a union of sets of the upper level. It differs from 
the operation $+$ because it does not merge elementary CESs into a single set of 
annotated nodes. The operation $\otimes$ on the set of TCESs is a Cartesian 
product of sets of the upper level. The operation $\otimes$ on the set of UCESs 
is the same as the operation $*$ on the set of CESs. In its canonical form a TCES 
is a sum of its firing components. 

The class of TCESs is a useful intermediate class between PNs and CESs, because 
it is strongly equivalent to the class of PNs and we can transform any TCES 
into a structurally equivalent CES with the help of the folding transformation. 
On the other hand, each CES has a strongly equivalent TCES. Thus, there are only 
structural differences between TCESs and CESs; their semantics are the same. 
So the semantics of hierarchy is transferred to the class of TCESs without 
essential changes. Thus, we have: 

Theorem 2. (2.) Each HPN has a strongly equivalent hierarchical TCES. 

Sketch of a proof. The algorithm for constructing a strongly equivalent HCES 
proceeds stage by stage: 

1. we map the cover-net of the given HPN to a strongly equivalent CES, which is 
the cover-structure of the HCES being constructed; 

2. we map the cover-net of the internal HPN of each compound transition of the 
first level to a strongly equivalent CES, which is the cover-structure of the 
internal HCES of the corresponding global firing component; 

3. and so on. 

At each stage we deal with mapping an ordinary PN to an ordinary CES. The 
algorithm for such a mapping was constructed, and its correctness and 
completeness were proved, in [6]. 

5 Conclusion 

This work continues the series of papers [5], [6], [7] devoted to constructing 
different extensions and generalizations of cause-effect structures. 
Moreover, the globalization function proposed in this work has a more general 
and important meaning. It allows us to unite all these extensions into a 
universal model. That is, such a function may assign internal CESs to one group 
of firing components, time restrictions to another, and rules of token colour 
transformation to yet other firing components. Thus, an important direction of 
future investigation is constructing a high-level class of CESs which will unite 
the capabilities and advantages of all the above semantics and the two-level 
representation. 

Abstract. Nested Petri nets are Petri nets using other Petri nets as 
tokens, thereby allowing easy description of hierarchical systems. Their 
nested structure makes some important verification problems undecid- 
able (reachability, boundedness, . . . ) while some other problems remain 
decidable (termination, inevitability, . . . ). 



1 Introduction 

For modelling and analysis of distributed concurrent systems, there exists a large 
variety of formalisms based on Petri nets [Rei85, Jen92, Smi96, Lom97]. Among 
them, several approaches extend the Petri net formalism with notions and 
structures inspired by object-oriented programming [Sib94, Lak95, MW97, Val98]. 
Such extensions are helpful for modelling hierarchical multi-agent distributed 
systems. 

While Sibertin-Blanc [Sib94], Lakos [Lak95], Moldt and Wienberg [MW97] 
consider systems with communicating coloured Petri nets, Valk [Val98] in his 
object Petri nets considers tokens as objects with a net structure. In his approach, 
the system net and object nets are elementary net systems, but an object is in 
some sense not located in one place (since Valk uses object Petri nets for solving 
specific fork-join situations in task planning systems), and this leads to a rather 
complex definition of the notion of states for object Petri nets. 

Nested Petri nets. Here we study another Petri net model where tokens may be 
nets themselves: nested ^ Petri nets [Lom98]. Nested Petri nets are a convenient 
tool for modelling hierarchical multi-agent dynamic systems. The object nets in 
a nested Petri net have their own structure and behaviour, they may evolve and 

* This work was mainly prepared during the stay of the first author at Lab. 
Specification & Verification in June-July 1998, and was partly supported by 
INTAS-RFBR (Grant 95-0378) and the Russian Fund for Basic Research (Project 
No. 96-01-01717). 

^ The word “nested” points to the analogy with nested sets, containing sets as their 
elements, which in turn may contain sets and so on. There may be any fixed number of 
levels in nested Petri nets. It is also possible to consider nested nets with unbounded 
depth, but we do not do this here. 



D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI'99, LNCS 1755, pp. 208-220, 2000. 
disappear during the system lifetime, and their number is unlimited. A nested 
Petri net has four kinds of steps. A transfer step is a step in a system net, which 
can "move", "generate", or "remove" objects, but does not change their inner 
states. An object-autonomous step changes only an inner state in one object. 
There are also two kinds of synchronisation steps. Horizontal synchronisation 
means simultaneous firing of two object nets situated in the same place of a 
system net. Vertical synchronisation means simultaneous firing of a system net 
together with some of its objects involved in this firing. 

In this paper we show that some crucial verification problems remain decidable 
for nested Petri nets while others become undecidable. This shows that nested 
Petri nets are in some sense weaker than Turing machines and stronger than 
ordinary, "flat" Petri nets. The decidability results are mostly based on the 
theory of Well-Structured Transition Systems [Fin90, ACJY96, FS98]. 

The paper is organised as follows. In section 2 we start with a simple example 
of a two-level nested Petri net with ordinary Petri nets as tokens. Section 3 
contains definitions of nested Petri nets. In Section 4 the expressive power of 
nested Petri nets and some other Petri nets models is compared. In Section 5 we 
prove that nested Petri nets are well-structured transition systems and deduce 
some decidability and undecidability properties. Section 6 gives some conclusions 
and directions for further research. 



2 An Introductory Example 

To give the reader an intuitive idea of nested Petri nets we start with a small 
example of a two-level nested Petri net NPN, represented in Fig. 1. It models a 
set of workers receiving tasks from time to time. A worker's behaviour is 
described by an object (element) net EN. EN is an elementary Petri net. When 
a task comes, a worker has to borrow a tool from the buffer of tools. The buffer 
of tools is represented by a system net SN. It is a high-level Petri net with 
tokens of three types: black dots, tools (unstructured dots of some colour) and 
workers (represented by nets). 

The number of workers involved in this system is unlimited. The set $A$ of 
tools is fixed and initially located in the place $S_5$. In our example $A$ is 
finite (with $N$ elements). 

Arcs in the system net SN are further labeled by expressions (variables and 
constants in our example), as in high-level Petri nets. If no expression is 
ascribed, the arc is supposed to transfer a black dot. In our NPN example, the arc 
expression $x$ is a variable for a worker (having a marked net EN as its value), 
$y$ is a variable for a tool (a corresponding arc transfers a coloured dot for a 
tool), and $W_2$ is a constant for an element net EN with the marking $\{W_2\}$, 
i.e., having only one token in place $W_2$. 

Some transitions in EN and in SN are marked by labels such as $t_1$ and $t_2$. 
They are used for synchronisation of transition firings in system and element 
nets. Thus a transition marked by $t_2$ in SN may only fire simultaneously with 
the firing of a transition marked by $t_2$ in the element net EN which is 
involved in firing $t_2$ (i.e., which is transferred by it). 

Fig. 1. NPN, a nested Petri net 

The substantive meaning of places in the element net EN is as follows: $T$: 
there is a task for the worker; $W_0$: the worker is idle; $W_1$: the worker has 
got a task; $W_2$: the worker is applying for a tool; $W_3$: the worker is busy 
with a task; $W_4$: the worker finished a task. 

In the system net the places are: $S_1$: the buffer of tools is open; $S_2$: 
workers applying for a tool; $S_3$: workers with tools; $S_4$: the number of 
borrowed tools; $S_5$: tools available; $S_6$: workers returning tools. 

In the initial marking represented in Fig. 1 the system net SN (modelling a 
buffer of tools) contains a black dot token in the place $S_1$ (meaning the buffer 
is open for workers) and the set $A$ of $N$ tools in the place $S_5$; a net EN for 
a worker contains a token in the place $W_0$ (a worker is idle) and $T$ is empty, 
meaning there are no tasks for a worker. Note that initially there are no net 
tokens (workers) in SN, so the element net EN for a worker plays the role of a 
type description. 

To illustrate the behaviour of a nested Petri net we follow several possible 
steps of NPN. In the initial marking the unlabelled transition in SN may fire, 
putting a net token $W_2$ (EN with marking $\{W_2\}$) into place $S_2$. This step 
creates an instance of EN in $S_2$. After that the transition marked by $t_2$ in 
SN may fire synchronously with the transition marked by $t_2$ in the element net 
lying in $S_2$. After that the net EN with the marking $\{W_3\}$ will be situated 
in the place $S_3$, the set $A$ in $S_5$ will be diminished by one token and the 
place $S_4$ gets one token. Then a labelled transition in SN may fire 
synchronously with the correspondingly labelled transition in the element net 
lying in $S_3$. Continuation of this process may lead to the marking shown in 
Fig. 2, where there is one worker applying for a tool, two workers with tools, 
and one worker who has come to return a tool. Here $A'$ designates the set $A$ 
diminished by three tokens. 










Fig. 2. An example of a reachable state for NPN 



3 Nested Petri Nets 

Definition 3.1. Let $P$ and $T$ be disjoint sets and let 
$F \subseteq (P \times T) \cup (T \times P)$. Then $N = (P, T, F)$ is a net. The 
elements of $P$, $T$ and $F$ are called places, transitions and arcs respectively. 

Pictorially, P-elements are represented by circles, T-elements by boxes, and the 
flow relation $F$ by directed arcs. For $x \in P \cup T$ we write ${}^{\bullet}x$ 
for the pre-set $\{y \mid yFx\}$ of $x$, and $x^{\bullet}$ for its post-set 
$\{y \mid xFy\}$. The input arcs of a transition $t$ are those in 
$\{(x, t) \mid x \in {}^{\bullet}t\}$, its output arcs are those in 
$\{(t, x) \mid x \in t^{\bullet}\}$. 

Markings. In the Coloured Petri nets formalism [Jen92], places carry marked 
multisets of coloured tokens. Recall that a multiset $m$ over a set $S$ is a 
mapping $m : S \to \mathbb{N}$, where $\mathbb{N}$ is the set of natural numbers; 
$m$ is finite iff $\{s \in S \mid m(s) > 0\}$ is. We let $m \leq m'$ (resp. 
$m + m'$) denote multiset inclusion (resp. sum). By $S_{ms}$ we denote the set of 
all finite multisets over $S$. 

Definition 3.2. Let $N = (P, T, F)$ be a net and $S$ an arbitrary set. A marking 
of $N$ over $S$, also called an S-marking, is a function $M$ from $P$ to $S_{ms}$ 
mapping every place to a multiset over $S$. A marked net is a net together with 
some marking, called the initial marking of this net. 

In the above definition tokens may be arbitrarily complex objects (as in Coloured 
Petri nets). In nested Petri nets, tokens may be nets. 

Transitions. As with Coloured Petri nets, we want to keep track of moving 
tokens. For this we label arcs with variables and other expressions. 

Let $V = \{v_1, \ldots\}$ be a set of variable names, and $C = \{c_1, \ldots\}$ a 
set of constant names. Write $A$ for the set $V \cup C$ of atoms. An expression is 
a finite multiset of atoms (usually written with the binary symbol $+$: e.g., 
$v_1 + (c_2 + v_1)$ is an expression). $Expr(A)$ is another way of denoting 
$A_{ms} = \{e, \ldots\}$, the set of expressions. For $e \in Expr(A)$, $Var(e)$ is 
the set of variables occurring in expression $e$. 

Assume any constant $c$ denotes a fixed element $c_S$ in $S$. Assume $b$ maps any 
variable $v$ to an element $b(v) \in S$. Then $b(e)$ denotes a multiset over $S$ 
in the obvious way. 

Let $Lab = \{l_1, l_2, \ldots\}$ and $Lab' = \{l'_1, l'_2, \ldots\}$ be two 
disjoint sets of labels. For each label $l \in Lab \cup Lab'$ we define an 
adjacent label $\bar{l}$, such that the sets $Lab$, $Lab'$, 
$\overline{Lab} =_{def} \{\bar{l} \mid l \in Lab\}$ and 
$\overline{Lab'} =_{def} \{\bar{l} \mid l \in Lab'\}$ are pairwise disjoint. Let 
$\bar{\bar{l}} =_{def} l$ and 
$\mathcal{L} =_{def} Lab \cup Lab' \cup \overline{Lab} \cup \overline{Lab'}$. 

Now we come to the definition of a nested Petri net structure, consisting of 
a system net, several element nets, labels on arcs, and labels on transitions. 

Definition 3.3. A nested Petri net structure $\Sigma$ is an array of $k > 1$ nets 
$N_1, \ldots, N_k$, where $N_1$ is a distinguished net, called a system net, and 
the $N_i$'s, for $i = 2, \ldots, k$, are called element nets. 

In any $N_i = (P_i, T_i, F_i)$ the input (resp. output) arcs from $F_i$ are 
labeled by expressions $\mathcal{E}(p, t)$ (resp. $\mathcal{E}(t, p)$) from 
$Expr(A)$. We require that there are no constants in input arc labels and no 
variable occurs twice in an input label $\mathcal{E}(p, t)$, or in two input 
labels for the same transition. (There is no restriction on the output labels.) 
Examples of forbidden arc inscriptions are shown in Fig. 3. In any $N_i$, the 
transitions may carry labels from $\mathcal{L}$ (possibly several labels). 

Assume a given nested Petri net structure $\Sigma$, let markings in the element 
nets $N_2, \ldots, N_k$ be considered over some finite sets $S_2, \ldots, S_k$ 
correspondingly, and let $\mathcal{M}$ denote the set of all marked element nets 
of $\Sigma$. 

Fig. 3. Forbidden input arc inscriptions 



Definition 3.4. A nested Petri net (NP-net) is a nested Petri net structure 
$\Sigma$ with each constant $c \in C$ interpreted as some marked element net from 
$\mathcal{M}$. 

By a marking of an NP-net, we mean a marking of its system net over the set 
$\mathcal{M}$. 

A marked NP-net is an NP-net together with some (initial) marking. 

Note that the definition of an NP-net depends on the sets $S_2, \ldots, S_k$ as 
parameters. If the $S_i$'s are one-element sets, then the object nets are ordinary 
Petri nets with black dots as tokens, and a nested Petri net is just a system net 
with ordinary nets as tokens. If the $S_i$'s are sets of coloured tokens, then a 
nested Petri net has Coloured Petri nets as tokens. If the $S_i$'s are sets of 
marked nets, we get a three-level (or deeper) structure, in which element nets 
are system nets with respect to the next level. Clearly, we can have as many 
levels as we like. Finally, if some of the sets $S_2, S_3, \ldots, S_k$ contain 
the system net $N_1$ as an element, we get recursion, which is not considered 
here. 

In NP-nets, firing a transition requires instantiating the variables in arc la- 
bels: 

Definition 3.5. Let $N_i = (P_i, T_i, F_i)$ be a net in a nested net NPN. 

1. A binding of a transition $t \in T_i$ is a function $b$ mapping each variable 
$v \in V$ to a value $b(v)$ from the set 
$\mathcal{M} \cup S_2 \cup S_3 \cup \ldots \cup S_k$. 

2. A bound transition is a pair $Y = (t, b)$, where $t$ is a transition and $b$ is 
a binding of $t$. 

3. A bound transition $Y = (t, b)$ is enabled in a marking $M$ of $N_i$ iff 
$\forall p \in {}^{\bullet}t : b(\mathcal{E}(p, t)) \subseteq M(p)$. 

4. An enabled bound transition $Y = (t, b)$ may fire in a marking $M$ and yield 
a new marking $M'$, written $M [Y\rangle M'$. For any $p \in P_i$, 
$M'(p) =_{def} M(p) - b(\mathcal{E}(p, t)) + b(\mathcal{E}(t, p))$. 

5. Marked element nets (as opposed to black dot tokens) which serve as variable 
values in the input arc expressions of $t$ are said to be involved in the firing 
of $t$. (They are removed from input places and may be brought to output places 
of $t$.) 
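
Items 3 and 4 of the definition above amount to multiset inclusion and multiset arithmetic, which can be sketched with Python's `collections.Counter`. The sketch is illustrative only: the function names, the token names (`worker1`, `hammer`, `saw`) and the arc labelling of the example transition are assumptions loosely modelled on the tool-buffer example, and net tokens are represented by plain names, so their inner markings are not modelled.

```python
from collections import Counter

def evaluate(expr, binding, constants):
    """b(e): the multiset of token values denoted by an arc expression."""
    out = Counter()
    for atom, n in expr.items():
        value = binding[atom] if atom in binding else constants[atom]
        out[value] += n
    return out

def enabled(marking, inputs, binding, constants):
    """Item 3: b(E(p, t)) is included in M(p) for every input place p."""
    return all(not (evaluate(e, binding, constants) - marking[p])
               for p, e in inputs.items())

def fire(marking, inputs, outputs, binding, constants):
    """Item 4: M'(p) = M(p) - b(E(p, t)) + b(E(t, p))."""
    assert enabled(marking, inputs, binding, constants)
    new = {p: Counter(m) for p, m in marking.items()}
    for p, e in inputs.items():
        new[p] -= evaluate(e, binding, constants)
    for p, e in outputs.items():
        new.setdefault(p, Counter())
        new[p] += evaluate(e, binding, constants)
    return new

# Hypothetical transition: takes a worker x from S2 and a tool y from S5,
# and puts the worker into S3.
inputs = {"S2": Counter({"x": 1}), "S5": Counter({"y": 1})}
outputs = {"S3": Counter({"x": 1})}
marking = {"S2": Counter({"worker1": 1}),
           "S5": Counter({"hammer": 1, "saw": 1}),
           "S3": Counter()}
binding = {"x": "worker1", "y": "hammer"}
after = fire(marking, inputs, outputs, binding, constants={})
```

After firing, `worker1` has moved from `S2` to `S3` and only `saw` remains in `S5`; a binding of `y` to a tool that is not in `S5` would leave the transition disabled.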



Now we come to defining a step in a NP-net. 
Definition 3.6. Let NPN be an NP-net. A step of NPN is either 

a transport step: firing (through some appropriate binding) an unlabeled 
transition in the system net, not changing the markings of element nets; 

an object-autonomous step: firing an unlabeled transition in one of the 
element nets, while all element nets remain in the same places of the system 
net; 

a horizontal synchronisation step: simultaneous firing of two transitions 
of two element nets lying in the same place w.r.t. the same binding, provided 
these two transitions are marked by two adjacent labels $l$ and $\bar{l}$ from 
$Lab' \cup \overline{Lab'}$; 

a vertical synchronisation step: simultaneous firing of a transition $t$ marked 
by a label $l \in Lab \cup \overline{Lab}$ in the system net and transitions 
marked by the adjacent label $\bar{l}$ in the element nets involved in the firing 
of $t$. 

We say a marking $M'$ is (directly) reachable from a marking $M$, and write 
$M \to M'$, if there is a step in NPN leading from $M$ to $M'$. 

An execution of an NP-net NPN is a sequence of markings 
$M_0 \to M_1 \to M_2 \to \ldots$ successively reachable from the initial marking 
$M_0$. 

4 Nested Petri Nets and Other Petri Net Models 

In this section we compare the expressive power of nested Petri nets with that of
some other Petri net models. First of all, since tokens in a system net may be just
black dots, we immediately get

Proposition 4.1. Ordinary Petri nets form a special case of nested Petri nets.

Then we compare nested Petri nets with some extensions of the ordinary Petri
net model.

Petri nets with reset arcs [Cia94] extend the basic model with special "reset"
arcs, which denote that the firing of some transitions resets (empties) the
corresponding places.

Theorem 4.2. Petri nets with reset arcs can be simulated by nested Petri nets
with ordinary Petri nets as object nets.

Proof. The idea is to simulate the presence of n tokens in a place by one simple
element net having n tokens. Then it is possible to simulate the effect of a reset
arc by removing this net token in one step and replacing it with E0, a constant net
with zero tokens. Increments and decrements of the number of tokens in a place are
simulated by increments and decrements of the number of tokens in the corresponding
element net. They are enforced by the synchronisation mechanism.

Fig. 4(a) shows a fragment of a Petri net with n tokens in a place p, an
incrementing (for p) arc $(t_+, p)$ and a decrementing arc $(t_-, p)$. Fig. 4(b)
represents a fragment of an NP-net simulating it. Here the n tokens in p are
replaced by one net token EN, which has one place with n black dot tokens and two
transitions marked by $l_+$ and $l_-$, which add a token to or remove a token from p.
These transitions fire synchronously with the transitions $t_+$ or $t_-$,
respectively, in the system net. Transition $t_r$ in the system net removes the net
token from p, thus emptying it. □
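
The construction in this proof can be mimicked by a small Python sketch. It is only an illustration under simplifying assumptions: the class `SystemPlace` and its method names are hypothetical, the element net is reduced to a single counter for its one place, and synchronisation is modelled by a plain method call.

```python
class SystemPlace:
    """System-net place p holding exactly one net token (assumed layout).
    The net token is an element net with a single place q."""

    def __init__(self, n=0):
        self.net_token = {"q": n}

    def t_plus(self):
        # Fires together with the element-net transition that adds a dot to q.
        self.net_token["q"] += 1

    def t_minus(self):
        # Fires together with the element-net transition that removes a dot;
        # it is enabled only if q is non-empty.
        if self.net_token["q"] == 0:
            raise ValueError("t- is not enabled: the simulated place is empty")
        self.net_token["q"] -= 1

    def t_reset(self):
        # Transport step: the net token is replaced by the empty constant net.
        self.net_token = {"q": 0}

    def tokens(self):
        return self.net_token["q"]
```

The point of the sketch is that `t_reset` works in a single step regardless of how many tokens the place holds, which an ordinary Petri net transition cannot do.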





Fig. 4. Simulation of a reset arc: (a) a Petri net fragment, (b) the simulating NP-net fragment



Since it is known [DFS98, DJS99] that Petri nets with reset arcs are more 
expressive than ordinary Petri nets, we immediately get the following 

Theorem 4.3. Nested Petri nets with ordinary Petri nets as object nets are 
more expressive than “flat” ordinary Petri nets. 

5 Decidability for Nested Petri Nets 

In this section we discuss some issues of decidability for nested Petri nets. First, 
we briefly formulate some problems crucial for verification of Petri nets. 

A net terminates if there exists no infinite execution (Termination Problem).
A marking $M'$ is reachable from $M$ if there exists a sequence of steps leading
from $M$ to $M'$ (Reachability Problem). The reachability set of a net is the set of
all markings reachable from the initial marking. A net is bounded if its reachability
set is finite (Boundedness Problem). The Control-State Maintainability Problem
is to decide, given an initial marking $M$ and a finite set $Q = \{q_1, q_2, \dots, q_m\}$
of markings, whether there exists a computation starting from $M$ where all
markings cover (are not less than w.r.t. some ordering) one of the $q_i$'s. The dual
problem, called the Inevitability Problem, is to decide whether all computations
starting from $M$ eventually visit a state not covering one of the $q_i$'s; e.g. for
Petri nets we can ask whether a given place will eventually be emptied.

Since NP-nets simulate Petri nets with reset arcs, problems undecidable for
Petri nets with reset arcs are also undecidable for NP-nets.

Theorem 5.1. 1. Reachability is undecidable for nested Petri nets. 

2. Boundedness is undecidable for nested Petri nets. 

Proof. By Theorem 4.2, nested Petri nets can simulate Petri nets with reset
arcs; hence the validity of these two statements follows from the undecidability of
reachability [AK77] and boundedness [DFS98, DJS99] for Petri nets with reset
arcs. □

To obtain decidability results we use the notion of well-structured transition
system introduced in [Fin90, ACJY96]. Recall that a transition system is a pair
$\mathcal{S} = (S, \to)$ where $S$ is an abstract set of states (or configurations)
and $\to \subseteq S \times S$ is any transition relation. For a transition system
$\mathcal{S} = (S, \to)$ we write $Succ(s)$ for the set $\{s' \in S \mid s \to s'\}$
of immediate successors of $s$. $\mathcal{S}$ is finitely branching if all $Succ(s)$
are finite.

A well-structured transition system is a transition system with a compatible
wqo: recall that a quasi-ordering (a qo) is any reflexive and transitive relation
$\le$ (over some set $X$).

Definition 5.2. A well-quasi-ordering (a wqo) is any quasi-ordering $\le$ such
that, for any infinite sequence $x_0, x_1, x_2, \dots$ in $X$, there exist indexes
$i < j$ with $x_i \le x_j$.

Note that if $\le$ is a wqo, then any infinite sequence contains an infinite
increasing subsequence: $x_{i_0} \le x_{i_1} \le x_{i_2} \le \dots$
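
As a small illustration of Definition 5.2, the following Python sketch uses the componentwise ordering on vectors of natural numbers (ordinary Petri net markings), which is a wqo by Dickson's lemma. The function names `leq` and `first_increasing_pair` are hypothetical and not taken from the paper.

```python
def leq(x, y):
    """Componentwise ordering on equal-length vectors of naturals."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))

def first_increasing_pair(seq):
    """Return the first (i, j) with i < j and seq[i] <= seq[j], or None.
    On a finite prefix the answer may be None; the wqo property only
    guarantees that a pair exists in every infinite sequence."""
    for j in range(len(seq)):
        for i in range(j):
            if leq(seq[i], seq[j]):
                return i, j
    return None
```

A finite antichain such as `[(2, 0), (1, 1), (0, 2)]` yields `None`, but it cannot be extended to an infinite sequence without eventually producing such a pair.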

Definition 5.3. A well-structured transition system (a WSTS) is a transition
system $\mathcal{S} = (S, \to, \le)$ equipped with an ordering
$\le \subseteq S \times S$ between states such that

- $\le$ is a wqo, and
- $\le$ is "compatible" with $\to$,

where "compatible" means that for all $s_1 \le t_1$ and every transition
$s_1 \to s_2$, there exists a transition $t_1 \to t_2$ such that $s_2 \le t_2$.

[FS98, FS97] introduce more liberal notions of compatibility:

A WSTS $\mathcal{S}$ has transitive compatibility if for all $s_1 \le t_1$ and every
transition $s_1 \to s_2$, there exists a nonempty sequence
$t_1 \to t_2 \to \dots \to t_n$ with $s_2 \le t_n$.
A WSTS $\mathcal{S}$ has stuttering compatibility if for all $s_1 \le t_1$ and every
transition $s_1 \to s_2$, there exists a nonempty sequence
$t_1 \to t_2 \to \dots \to t_n$ with $s_2 \le t_n$ and $s_1 \le t_i$ for all $i < n$.

Now we define a wqo on the set of states of our NP-nets and show that they
are WSTS.

Definition 5.4. Let NPN be a nested Petri net and $\mathcal{M}_{ms}$ the set of all
its states.

A quasi-ordering $\preceq$ on $\mathcal{M}_{ms}$ is defined as follows:
for $M_1, M_2 \in \mathcal{M}_{ms}$, $M_1 \preceq M_2$ iff for all
$p \in P_{\mathcal{N}_1}$ there exists an injective function
$j_p : M_1(p) \to M_2(p)$ such that for every $s \in M_1(p)$: either $j_p(s) = s$,
or $s = (\mathcal{N}_i, m)$ and $j_p((\mathcal{N}_i, m)) = (\mathcal{N}_i, m')$
with $m \preceq m'$.

Fig. 5 shows an example of two markings $M_1$, $M_2$ of some NP-net, ordered
w.r.t. $\preceq$. Here the system net has three places $p_1, p_2, p_3$. The only
element net has places $q_1, q_2$. In both markings the place $p_1$ is empty and the
place $p_2$ contains one net token, but the marking of this net token in $M_1$ is
included in the corresponding marking in $M_2$. The place $p_3$ in $M_2$ contains the
same net token as in $M_1$ plus one more net token. Thus, the relation $\preceq$ is a
kind of nested set inclusion.
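
The nested ordering of Definition 5.4 can be checked mechanically for two-level nets. The Python sketch below is an illustration under assumptions that are not in the paper: atomic tokens are strings, a net token is a pair `(net_name, inner_marking)` whose inner marking maps element-net places to numbers of black dots, and the required injective function is found with a simple augmenting-path matching. All function names are hypothetical.

```python
def token_leq(s, t):
    """s may be mapped to t: equal atomic tokens, or net tokens of the same
    net whose inner markings are ordered placewise."""
    if isinstance(s, tuple) and isinstance(t, tuple):
        (net_s, m), (net_t, m2) = s, t
        return net_s == net_t and all(m2.get(q, 0) >= n for q, n in m.items())
    return s == t

def has_injection(small, big):
    """Is there an injective map from `small` into `big` along token_leq?
    Uses augmenting paths (bipartite matching)."""
    match = [None] * len(big)  # match[j] = index of the small token sent to big[j]

    def try_assign(i, seen):
        for j, t in enumerate(big):
            if j not in seen and token_leq(small[i], t):
                seen.add(j)
                if match[j] is None or try_assign(match[j], seen):
                    match[j] = i
                    return True
        return False

    return all(try_assign(i, set()) for i in range(len(small)))

def marking_leq(m1, m2):
    """M1 is below M2 iff every system place admits such an injection."""
    return all(has_injection(m1.get(p, []), m2.get(p, []))
               for p in set(m1) | set(m2))
```

A pair of markings in the spirit of Fig. 5 (the concrete token counts here are made up) would be `M1 = {"p2": [("EN", {"q1": 1})], "p3": [("EN", {"q2": 1})]}` and `M2 = {"p2": [("EN", {"q1": 1, "q2": 1})], "p3": [("EN", {"q2": 1}), ("EN", {"q1": 2})]}`, for which `marking_leq(M1, M2)` holds and `marking_leq(M2, M1)` does not.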





Fig. 5. An example of markings $M_1 \preceq M_2$



Proposition 5.5. Let NPN be a nested Petri net, with $\mathcal{M}_{ms}$ the set of
all its states, $\to$ the step relation on $\mathcal{M}_{ms}$, and $\preceq$ the
quasi-ordering on $\mathcal{M}_{ms}$ defined above. Then
$(\mathcal{M}_{ms}, \to, \preceq)$ is a well-structured transition system.

Proof. $\preceq$ is clearly a well-quasi-ordering, but we must show that it is
compatible with the transition relation $\to$. We have four cases.

1. Let $M_1 \to M_1'$ be a transport step in a nested net via a transition $t$ and
let $M_1 \preceq M_2$. Then for every token $s \in M_1(p)$ transferred by $t$ (with
$p \in {}^{\bullet}t$) there exists an object $j_p(s) \in M_2(p)$. Since, due to the
restriction on input expressions, all objects are transferred independently and the
firing of $t$ does not actually depend on object markings, the transition $t$ is
also enabled in $M_2$. It is easy to see that if $M_2 \to M_2'$, then
$M_1' \preceq M_2'$.

2. For an object-autonomous step compatibility is obvious.

3. A horizontal synchronisation step is a simultaneous execution of several
object-autonomous steps. Its compatibility can be proved analogously to
the previous case.

4. A vertical synchronisation step is a simultaneous execution of a transport step
and several object-autonomous steps. Its compatibility does not follow directly
from the first two cases, but it can be proved by combining the previous proofs.

□

Note that if we did not restrict multiple occurrences of variables in input arc
expressions, we would not obtain a WSTS; likewise, the Object Petri nets of Valk
are not WSTS.

It was proved in [FS97] that

- Termination is decidable for WSTS's with (1) transitive compatibility, (2)
decidable $\le$, and (3) effective $Succ(s)$. (Theorem 4.6.)

- The control-state maintainability problem and the inevitability problem are
decidable for WSTS's with (1) stuttering compatibility, (2) decidable $\le$, and
(3) effective $Succ(s)$. (Theorem 4.8.)
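
The termination result rests on exploring a finite reachability tree: a branch is cut as soon as a state covers one of its ancestors, since compatibility then yields an infinite execution. The Python sketch below illustrates this idea on ordinary Petri nets only, with markings as tuples and transitions as hypothetical `(consume, produce)` vector pairs; it is not the procedure for NP-nets themselves, and the function names are assumptions.

```python
def leq(x, y):
    """Componentwise ordering on markings of the same length."""
    return all(a <= b for a, b in zip(x, y))

def successors(marking, transitions):
    """Effective Succ: every enabled transition yields one successor."""
    for consume, produce in transitions:
        if leq(consume, marking):
            yield tuple(m - c + p for m, c, p in zip(marking, consume, produce))

def terminates(initial, transitions):
    """True iff no branch of the tree reaches a state covering an ancestor."""
    def explore(state, ancestors):
        path = ancestors + [state]
        for nxt in successors(state, transitions):
            if any(leq(old, nxt) for old in path):
                return False  # nxt covers an ancestor: an infinite run exists
            if not explore(nxt, path):
                return False
        return True
    return explore(tuple(initial), [])
```

The exploration stops because the ordering is a wqo (no infinite branch can avoid covering an ancestor) and the system is finitely branching.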

It turns out that for NP-nets

Lemma 5.6. (1) The qo $\preceq$ is decidable.

(2) $Succ$ is effective.

With the help of these statements we can obtain the following decidability 
results for NP-nets: 

Theorem 5.7. Termination is decidable for nested Petri nets. 

Proof. Follows from Proposition 5.5, Lemma 5.6 and Theorem 4.6 in [FS97]. □ 

Corollary 5.8. Nested Petri nets are expressively strictly weaker than Turing 
machines. 

Proof. Since termination is not decidable for Turing machines. □ 

Theorem 5.9. The control-state maintainability problem and the inevitability
problem (w.r.t. $\preceq$) are decidable for nested Petri nets.

Proof. Follows from Proposition 5.5, Lemma 5.6 and Theorem 4.8 in [FS97]. □ 

6 Concluding Remarks 

Nested Petri nets are an extension of the Petri net formalism which gives a
visual and clear dynamic hierarchical and modular structure of the system. The
synchronization of hierarchical components is natural and powerful.

The structure of nested Petri nets gives a good intuition of their distributed
behaviour. Though we have only defined an interleaving semantics here, it can
be naturally generalised to simultaneous or independent firings. With two kinds
of synchronisation, horizontal for cooperation of elements and vertical for
coordination of a system and its elements, the nested Petri net formalism can be
considered as a kind of generalisation of modular Petri nets (see, e.g., [CP92,
Lom97a]) and hierarchical Petri nets (e.g., [Jen92]).

Thus, nested Petri nets turn out to be a visual and expressive tool for
modelling multi-agent distributed systems. At the same time, decidability of such
important properties as termination gives ground for solving some verification
problems for them. Being still less expressive than Turing machines, nested Petri
nets preserve the merits of the Petri net model.

Further research on nested Petri nets involves the investigation of recursive
nested Petri nets, where a system net contains its own copy as an element
(directly or via other elements), and of decidability questions for them.

Abstract. This paper describes an algebraic structure which allows us to
define an abstract framework for communication in distributed systems.
Using this structure we introduce an equivalence relation; the quotient
induced by this equivalence relation preserves the initial algebraic
structure. Our results represent a starting point in the abstract investigation of
the communication between processes, complementing the achievements
of the process algebras.



1 Introduction 

The aim of this paper is to find a class of structures as mathematical abstractions
used for structural and dynamic aspects of communication between concurrent
processes. We refer here just to the structural aspects of the communication
between processes. In process algebra, processes are usually considered as terms.
We treat a process by considering its internal structure, which is related only to
communication, namely its ports and communication symmetries. This work is
somewhat related to those of Milner [Mi] (flowgraphs, action structures) and
Lafont [La] (interaction nets). The name-free approach presented here is similar
to that of [Ho1, Ho2].

We describe how two processes can communicate with each other by using
communication handles. We consider mainly correspondences and maps of suitable
handles used by communicating processes. The basic elements of our approach are the
interface ports used by processes for interaction. We give a name-free presentation
where the notion of symmetry plays an essential role. Communication between
processes assumes points of interaction called handles (ports). Each communication
channel is determined by two points of interaction. Various concurrent
algebras and calculi use names for communication channels, i.e. they use names
for the corresponding interaction points which determine the communication
channels. This fact suggests that the semantics of processes depends on names.
Here we "forget" names, but still keep their functionalities. The essence of the
functionality of channel names is to determine multiple identities and to
represent in an intelligible way how we relate the communication entities. Over
the set of interaction points given by the communication channels, we define a
permutation group. Each permutation represents an interchange of interaction
points which preserves the external communication capabilities of a process. In
this way we introduce a communication structure which is based on the notions
of process, communication channels, interaction points and permutations. These
structures are designed to ease the introduction of a suitable notion of
congruence, and then a suitable notion of quotient. Our main result is that any
two quotients are isomorphic.

D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI'99, LNCS 1755, pp. 221-227, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

Gabriel Ciobanu and Emanuel Florentin Olariu



2 Communication Structures and Correspondences 
Definition 1. A communication structure is given by the following elements:
(i) a set $\mathcal{P}$ of processes; $p, q, s, \dots$ range over $\mathcal{P}$;

(ii) every process $p \in \mathcal{P}$ has a set of handles (interaction points) $H(p)$;

(iii) for every process $p \in \mathcal{P}$, and for every subset $K \subseteq H(p)$,
there exists a permutation subgroup $S^p_K$ over $K$;

(iv) for every $H \subseteq H' \subseteq H(p)$ we have:

(a) if $\rho \in S^p_{H'}$ and $\rho|_{H' \setminus H} = id_{H' \setminus H}$, then
$\rho|_H \in S^p_H$;

(b) $S^p_H$ is a subgroup in $S^p_{H'}$ (we use the extension by identity on
$H' \setminus H$).

The set $H(p)$ is the set of all communication points of a process $p$. The
permutations in these subgroups describe the internal symmetry of $p$, which
expresses the possibility of interchanging the communication points without
affecting the external communication between processes.
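
Condition (iv) relies on two elementary operations on permutations: extension by identity to a larger handle set and restriction to a smaller one. A minimal Python sketch follows, assuming (this encoding is not from the paper) that a permutation is a dictionary from handles to handles; the function names are hypothetical.

```python
def extend_by_identity(rho, bigger):
    """Extend a permutation of a subset to `bigger`, fixing all new handles."""
    return {h: rho.get(h, h) for h in bigger}

def restrict(rho, smaller):
    """Restrict a permutation to `smaller`; meaningful only when rho maps
    `smaller` onto itself."""
    return {h: rho[h] for h in smaller}

def fixes_outside(rho, smaller):
    """Check the premise of (iv)(a): rho is the identity outside `smaller`."""
    return all(rho[h] == h for h in rho if h not in smaller)
```

For example, the swap `{"a": "b", "b": "a"}` on `{"a", "b"}` extends to `{"a", "b", "c"}` by fixing `c`, and restricting the extension back to `{"a", "b"}` returns the original swap.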

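Definition 2. Let $\mathcal{P}$ and $\mathcal{Q}$ be two communication structures; a
correspondence between two processes $p \in \mathcal{P}$ and $q \in \mathcal{Q}$ is a
triple $(S^p_H, \varphi, S^q_{H'})$, where $H \subseteq H(p)$, $H' \subseteq H(q)$,
and $\varphi : S^p_H \to S^q_{H'}$ is a group morphism. We denote by
$\mathcal{C}(\mathcal{P}, \mathcal{Q})$ the set of all correspondences between
processes from $\mathcal{P}$ and $\mathcal{Q}$.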



3 Composing Correspondences. Maps 

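Definition 3. Let $\mathcal{P}$, $\mathcal{Q}$ and $\mathcal{S}$ be three process
structures. The composition of correspondences is a partial binary operation

$\circ : \mathcal{C}(\mathcal{P}, \mathcal{Q}) \times \mathcal{C}(\mathcal{Q}, \mathcal{S}) \to \mathcal{C}(\mathcal{P}, \mathcal{S})$

defined by

$(S^p_H, \varphi, S^q_{H'}) \circ (S^q_{H'}, \varphi', S^s_{H''}) = (S^p_H, \varphi' \circ \varphi, S^s_{H''})$,

where $p \in \mathcal{P}$, $q \in \mathcal{Q}$, $s \in \mathcal{S}$ and
$H \subseteq H(p)$, $H' \subseteq H(q)$, $H'' \subseteq H(s)$.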



Remark: It is not difficult to prove that the composition of correspondences is 
associative. 



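Definition 4. If $\mathcal{P}$ and $\mathcal{Q}$ are two communication structures, a
map from $\mathcal{P}$ to $\mathcal{Q}$ is a set of correspondences
$\mathcal{F} \subseteq \mathcal{C}(\mathcal{P}, \mathcal{Q})$ such that for every
process $p \in \mathcal{P}$ and every subset $H \subseteq H(p)$, there exists a unique
correspondence $(S^p_H, \varphi, S^q_{H'}) \in \mathcal{F}$, where $q \in \mathcal{Q}$
and $H' \subseteq H(q)$. We denote this map by $\mathcal{F} : \mathcal{P} \to \mathcal{Q}$.

If $\mathcal{F} : \mathcal{P} \to \mathcal{Q}$ and $\mathcal{F}' : \mathcal{Q} \to \mathcal{S}$
are two maps, then we define their composition
$\mathcal{F}' \circ \mathcal{F} : \mathcal{P} \to \mathcal{S}$ as the set of all
possible compositions of correspondences from $\mathcal{F}$ and $\mathcal{F}'$. The
composition $\mathcal{F}' \circ \mathcal{F}$ is also a map.

Definition 5. Let $\mathcal{P}$ be a communication structure.

(i) the identity map over $\mathcal{P}$, $1_{\mathcal{P}} : \mathcal{P} \to \mathcal{P}$,
is the set $\{(S^p_H, id, S^p_H) : H \subseteq H(p)\}$.

(ii) a map $\mathcal{F} : \mathcal{P} \to \mathcal{Q}$ is an isomorphism if there
exists another map $\mathcal{F}' : \mathcal{Q} \to \mathcal{P}$ such that
$\mathcal{F} \circ \mathcal{F}' = 1_{\mathcal{Q}}$ and
$\mathcal{F}' \circ \mathcal{F} = 1_{\mathcal{P}}$. In this case $\mathcal{F}'$ is the
inverse map of $\mathcal{F}$, and it is denoted by $\mathcal{F}^{-1}$.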

4 p-Translations and p-Equivalences

Definition 6. Let $\mathcal{P}$ be a communication structure.

(i) for two given processes $p, q \in \mathcal{P}$, a translation from $p$ to $q$ is
a triple consisting of $H \subseteq H(p)$, $H' \subseteq H(q)$ and a bijective
function $\delta : H \to H'$; it is denoted by $p_H \xrightarrow{\delta} q_{H'}$ or,
if possible, simply by $\delta$.

(ii) two translations $p_H \xrightarrow{\delta_1} q_{H'}$ and
$p_H \xrightarrow{\delta_2} q_{H'}$ from $p$ to $q$ are equivalent (and we denote
this by $\delta_1 \sim \delta_2$) if there are two permutations $\rho \in S^p_H$ and
$\rho' \in S^q_{H'}$ such that $\delta_1 = \rho' \circ \delta_2 \circ \rho$;

(iii) a p-translation $\Re : \mathcal{P} \to \mathcal{P}$ over the communication
structure $\mathcal{P}$ is a family $\Re$ of translations from $\mathcal{P}$ to
$\mathcal{P}$ with the following properties:

- for every translation $\delta \in \Re$, if $\delta \sim \delta'$, then
$\delta' \in \Re$, and

- if $p_H \xrightarrow{\delta} p_H \in \Re$, then $\delta \in S^p_H$.

The relation $\sim$ defined over translations is an equivalence relation.
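
The equivalence of translations in item (ii) can be tested by brute force when the permutation groups are small. The Python sketch below assumes, purely for illustration, that functions are encoded as dictionaries and that each group is given as an explicit list of permutations; the names `compose` and `equivalent` are hypothetical.

```python
from itertools import product

def compose(g, f):
    """(g o f)(h) = g(f(h)) for dictionary-encoded functions."""
    return {h: g[f[h]] for h in f}

def equivalent(delta1, delta2, group_source, group_target):
    """True iff delta1 = rho' o delta2 o rho for some rho in the source
    group and rho' in the target group."""
    return any(compose(rho_t, compose(delta2, rho_s)) == delta1
               for rho_s, rho_t in product(group_source, group_target))
```

With handles `{"a", "b"}` and `{"x", "y"}`, the two bijections `{"a": "x", "b": "y"}` and `{"a": "y", "b": "x"}` are equivalent as soon as the source group contains the swap of `a` and `b`, and inequivalent when both groups are trivial.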

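We define the following operations over translations and p-translations:

(i) the inverse of a translation $p_H \xrightarrow{\delta} q_{H'}$ is the translation
$q_{H'} \xrightarrow{\delta^{-1}} p_H$; the inverse of a p-translation $\Re$ is the
set $\{\delta^{-1} : \delta \in \Re\}$, denoted by $\Re^{-1}$;

(ii) the composition of two translations $p_H \xrightarrow{\delta} q_{H'}$ and
$q_{H'} \xrightarrow{\delta'} s_{H''}$ is the translation
$p_H \xrightarrow{\delta' \circ \delta} s_{H''}$; the composition of two
p-translations $\Re_1$ and $\Re_2$ over the same communication structure is the
family $\{\delta_1 \circ \delta_2 : \delta_1 \in \Re_1, \delta_2 \in \Re_2\}$
(whenever the composition $\delta_1 \circ \delta_2$ is possible), and it is denoted
by $\Re_1 \circ \Re_2$.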

Lemma 1. 

(i) the inverse of a p-translation is also a p-translation; 

(ii) the composition of two p-translations over the same structure is a p-tran- 
slation. 

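Proof. (i) Let $\Re : \mathcal{P} \to \mathcal{P}$ be a p-translation,
$\delta \in \Re$, and let $\mu$ be a translation such that $\delta^{-1} \sim \mu$. If
$p_H \xrightarrow{\delta} q_{H'}$, then $q_{H'} \xrightarrow{\delta^{-1}} p_H$ and
$q_{H'} \xrightarrow{\mu} p_H$; moreover, there exist two permutations
$\rho \in S^p_H$, $\rho' \in S^q_{H'}$ such that
$\rho \circ \delta^{-1} \circ \rho' = \mu$. As a consequence, we have
$\delta \sim \mu^{-1} \in \Re$, and $\mu \in \Re^{-1}$.

The second condition of the p-translation definition is obviously satisfied
by $\Re^{-1}$.

(ii) Let $\Re_1 : \mathcal{P} \to \mathcal{P}$, $\Re_2 : \mathcal{P} \to \mathcal{P}$
be two p-translations, and consider two translations
$p_H \xrightarrow{\delta} q_{H'} \in \Re_1$ and
$q_{H'} \xrightarrow{\delta'} s_{H''} \in \Re_2$. If $p_H \xrightarrow{\mu} s_{H''}$
is a translation from $p$ to $s$ such that $\delta' \circ \delta \sim \mu$, then we
can find two permutations $\rho \in S^p_H$ and $\rho' \in S^s_{H''}$ such that
$\rho' \circ \delta' \circ \delta \circ \rho = \mu$. For every permutation
$\sigma \in S^q_{H'}$, the above equality becomes
$\mu = (\rho' \circ \delta' \circ \sigma) \circ (\sigma^{-1} \circ \delta \circ \rho)$.
If we denote $\delta_2 = \rho' \circ \delta' \circ \sigma$ and
$\delta_1 = \sigma^{-1} \circ \delta \circ \rho$, then we have $\delta_1 \in \Re_1$,
$\delta_2 \in \Re_2$ and $\mu = \delta_2 \circ \delta_1 \in \Re_2 \circ \Re_1$.

The second condition from the definition of p-translations is easy to verify.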

Definition 7.

(i) the identity p-translation over a communication structure $\mathcal{P}$, denoted
by $id_{\mathcal{P}}$, is the family
$\{p_H \xrightarrow{\rho} p_H : p \in \mathcal{P}, H \subseteq H(p), \rho \in S^p_H\}$;

(ii) a p-equivalence is a p-translation $\Re$ defined over a communication structure
$\mathcal{P}$ which is reflexive ($id_{\mathcal{P}} \subseteq \Re$), symmetric
($\Re^{-1} \subseteq \Re$), and transitive ($\Re \circ \Re \subseteq \Re$).

Remark: The definition of correspondences between two processes used here has
the advantage that it does not require an equivalence relation as in [Ho1, Ho2].
The correspondences are not used to define translations and p-translations, and
this fact helps us to differentiate between maps and equivalences.

5 Quotient Structure. Isomorphism Theorem 

In this section we define the quotient related to a p-equivalence, and we show
that this quotient can be given a communication structure which is unique up to
an isomorphism. Let $\Re : \mathcal{P} \to \mathcal{P}$ be a p-translation, and
consider the set $\widehat{\mathcal{P}} = \{(p, H) : p \in \mathcal{P}, H \subseteq H(p)\}$;
for the sake of simplicity, we use the notation $p_H$ instead of $(p, H)$. Then we
define the binary relation
$\approx \; \subseteq \widehat{\mathcal{P}} \times \widehat{\mathcal{P}}$ by

$p_H \approx q_{H'}$ if and only if there exists a translation
$p_H \xrightarrow{\delta} q_{H'} \in \Re$.

Lemma 2. $\approx$ is an equivalence relation over $\widehat{\mathcal{P}}$ if and
only if $\Re$ is a p-equivalence.

Proof. $\approx$ is reflexive because $id_{\mathcal{P}} \subseteq \Re$. Symmetry comes
from the fact that $\Re^{-1} \subseteq \Re$. For transitivity, let
$p_H \xrightarrow{\delta} q_{H'}$ and $q_{H'} \xrightarrow{\delta'} s_{H''} \in \Re$;
since $\Re \circ \Re \subseteq \Re$, we have $p_H \xrightarrow{\delta' \circ \delta} s_{H''} \in \Re$.
For the other implication (i.e. only if) the proof is similar.

We denote the quotient $\widehat{\mathcal{P}}/{\approx}$ by $\Re^{\sim}$, and the
equivalence class of $p_H$ by $[p_H]$. We choose a representative from every
equivalence class, and for each such family of representatives we can build a
communication structure on $\Re^{\sim}$ in the following way:




Abstract Structures for Communication between Processes



- the set of processes is the family of all equivalence classes $[p_H]$;

- for every equivalence class $[p_H]$, where $p_H$ is the representative chosen
above, we define $H([p_H])$ as the set $H$ (which is included in $H(p)$);

- for every equivalence class $[p_H]$ and $K \subseteq H([p_H])$, the corresponding
permutation group is $S^p_K$ (which exists in our initial communication structure).

Theorem 1. The structure defined above on $\Re^{\sim}$ is a communication structure.

Proof. It is enough to verify only condition (iv) of the communication structure
definition. Let $[p_H]$ be an equivalence class; according to the above
construction, for this class we have $H([p_H]) = H$. We consider
$H' \subseteq H'' \subseteq H$. If $\rho \in S^p_{H''}$ with
$\rho|_{H'' \setminus H'} = id_{H'' \setminus H'}$, then $\rho|_{H'} \in S^p_{H'}$;
this relation comes from the original structure. On the other hand,
$S^p_{H'} \le S^p_{H''}$ (by extending the permutations); this is valid by the
definition of our initial communication structure $\mathcal{P}$.

According to the construction described above, two different representative
choices can determine different structures on the quotient $\Re^{\sim}$. The
following theorem shows the relationship between these structures.

Theorem 2. Any two structures determined by different representative choices
are isomorphic communication structures.

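Proof. Let $\{p_{iH} : i \in I\}$ and $\{q_{iH'} : i \in I\}$ be two families of
representatives selected from the equivalence classes such that
$p_{iH} \approx q_{iH'}$ for every $i \in I$. We denote by $\Re^{\sim}_1$ and
$\Re^{\sim}_2$ the structures determined by these families of representatives. Since
$p_{iH} \approx q_{iH'}$, we have a translation
$p_{iH} \xrightarrow{\delta_i} q_{iH'} \in \Re$ for every $i \in I$.

Let $H_i \subseteq H([p_{iH}]) = H$ and $H_i' = \delta_i(H_i)$. Consider now
$\rho_i \in S^{p_i}_{H_i}$; we extend $\rho_i$ on $H \setminus H_i$ by identity, and
we obtain $\rho \in S^{p_i}_H$. On the other hand, $\delta_i^{-1} \in \Re$ from the
symmetry of $\Re$, and $\rho \in \Re$ from the reflexivity of $\Re$. Now, from the
transitivity of $\Re$, we have

$\rho' = \delta_i \circ \rho \circ \delta_i^{-1} \in \Re$.

This means $q_{iH'} \xrightarrow{\rho'} q_{iH'} \in \Re$, which implies
$\rho' \in S^{q_i}_{H'}$.

We show now that $\rho'|_{H' \setminus H_i'} = id_{H' \setminus H_i'}$. Let $h$ be in
$H' \setminus H_i'$; then $\delta_i^{-1}(h) \in H \setminus H_i$ and
$\rho \circ \delta_i^{-1}(h) = \delta_i^{-1}(h)$; it follows that $\rho'(h) = h$ for
every $h \in H' \setminus H_i'$. Therefore

$\rho_i' = \rho'|_{H_i'} \in S^{q_i}_{H_i'}$.

In this way we can define $\varphi_{iH_i} : S^{p_i}_{H_i} \to S^{q_i}_{H_i'}$ by
$\varphi_{iH_i}(\rho_i) = \rho_i'$. In order to simplify our notation, we denote
$\varphi_{iH_i}$ by $\varphi$, and we prove that $\varphi$ is a group morphism.

First of all, it is easy to see that $\varphi$ maps the identity to the identity.
Let $\rho_i, \pi_i \in S^{p_i}_{H_i}$ be two permutations and
$\rho, \pi \in S^{p_i}_H$ their corresponding extensions by identity on
$H \setminus H_i$; we have $\sigma_i = \rho_i \circ \pi_i \in S^{p_i}_{H_i}$, and
$\sigma = \rho \circ \pi \in S^{p_i}_H$. Then we have

$\rho' = \delta_i \circ \rho \circ \delta_i^{-1}$, $\pi' = \delta_i \circ \pi \circ \delta_i^{-1} \in S^{q_i}_{H'}$,

$\rho_i' = \rho'|_{H_i'}$, $\pi_i' = \pi'|_{H_i'} \in S^{q_i}_{H_i'}$.

If we consider $\sigma' = \delta_i \circ \sigma \circ \delta_i^{-1}$, then

$\varphi(\rho_i \circ \pi_i) = \varphi(\sigma_i) = \sigma_i' = \sigma'|_{H_i'}$.

For every $h \in H_i'$ we have

$\sigma'(h) = ((\delta_i \circ \rho \circ \delta_i^{-1}) \circ (\delta_i \circ \pi \circ \delta_i^{-1}))(h) = (\rho' \circ \pi')(h) = \rho'(\pi_i'(h)) = \rho_i'(\pi_i'(h)) = (\rho_i' \circ \pi_i')(h)$.

In this way $\varphi(\rho_i \circ \pi_i) = \varphi(\rho_i) \circ \varphi(\pi_i)$,
i.e. $\varphi$ is a group morphism.

Now we can define $\psi_{iH_i'} : S^{q_i}_{H_i'} \to S^{p_i}_{H_i}$; we use $\psi$ as
simplified notation for $\psi_{iH_i'}$. For each permutation
$\rho_i' \in S^{q_i}_{H_i'}$, we consider $\rho' \in S^{q_i}_{H'}$, the usual
extension to $H'$. Then we have
$\rho = \delta_i^{-1} \circ \rho' \circ \delta_i \in S^{p_i}_H$, and we define

$\psi(\rho_i') = \rho_i = \rho|_{H_i}$.

Following similar arguments as above, we can prove that $\psi$ is a group morphism.

The family $\mathcal{F} = \{\varphi_{iH_i} : i \in I, H_i \subseteq H\}$ is a map from
$\Re^{\sim}_1$ to $\Re^{\sim}_2$, and
$\mathcal{F}' = \{\psi_{iH_i'} : i \in I, H_i' \subseteq H'\}$ is a map from
$\Re^{\sim}_2$ to $\Re^{\sim}_1$. $\mathcal{F}'$ is the inverse map of $\mathcal{F}$
since for every $i \in I$ we have

$\varphi_{iH_i} \circ \psi_{iH_i'} = id_{S^{q_i}_{H_i'}}$ and $\psi_{iH_i'} \circ \varphi_{iH_i} = id_{S^{p_i}_{H_i}}$.

□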



6 Conclusion 

This paper is an attempt to define and study a formal framework for communication
between processes of a distributed system. It introduces communication
structures, an abstract notion which could lead to a point of view complementary
to that of process algebras (CCS, ACP, π-calculus, action calculi, etc.).

Communication structures are essentially sets of processes. Each process is
equipped with a set of interaction points called handles, and with a family of
permutations over this set. The main contribution of the paper is the definition of
an equivalence on communication structures, and the construction of the respective
quotient structure, where an equivalence consists of suitable correspondences
between sets of handles of processes.

Abstract. We consider systems of cooperating logic programs which
generalize dynamic deductive databases (DDDBs) from [1,2]. Some properties
of the system behavior are defined which ensure an infinite steady
life of the system. Decision problems for these properties of cooperating
productional logic programs are investigated. It is shown that these
problems are reducible to the satisfiability problem for the propositional
temporal logic of branching time. It follows that stability problems for
cooperating productional logic programs are decidable in exponential time.



1 Introduction 

In this paper we consider a logical approach to the mathematical analysis of the
behavior of interactive discrete dynamic systems. A state of a dynamic system
is represented by a database state (DB state), i.e. a finite set of facts. The
behavior of the system is determined by the actions of a set of logic programs
which update the DB states. These actions generate a set of possible trajectories
of the system, i.e. sequences of DB states. Different requirements on the system
behavior can be defined in terms of conditions which should be satisfied by the
set of trajectories. Here we consider only one interesting kind of behavior
property expressible in such terms, namely the stability property of
the system. Moreover, we restrict our consideration to the case when the system
B consists of n + 1 (in general, nondeterministic) logic programs, a master
MP and a set of slaves $SP = \langle SP_1, \dots, SP_n \rangle$, which work over a
(finite dynamic) database $S$, updating states of $S$ in turn: on every odd step the
slave programs concurrently change the current database state, and on even steps the
master program updates the database state with the aim of restoring integrity
constraints possibly violated by the slaves in the previous step. This notion
of cooperating (symbiotic) logic programs generalizes the notion of dynamic
deductive databases with external updates from [1,2].

* This work was sponsored by the Russian Fundamental Studies Foundation (Grants
97-01-00973 and 98-01-00204).

D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI'99, LNCS 1755, pp. 228-234, 2000.

The binary relation on the DB states induced by the updates executed by
the set LP of logic programs $LP_1, \dots, LP_n$ is denoted by $\vdash_{LP}$ (to
avoid some inessential complications, in this paper we consider only systems LP
defining total relations $\vdash_{LP}$, so below we can consider only infinite
trajectories of the systems). Then the local behavior of the system B in the current
state $S_0$ is described as one interaction of SP and MP applied to this state, i.e.
as the sequence of two updates $S_0 \vdash_{SP} S_1' \vdash_{MP} S_1$. Normally, one
should distinguish between acceptable and unacceptable interactions, depending
on a criterion of admissibility of the system states. Each acceptable interaction
applies to an admissible state $S_0$ and yields an admissible state $S_1$. However,
the intermediate state $S_1'$ may in general be inadmissible, in which case the
reaction of MP compensates for the destructive actions of SP. We represent
the admissibility criterion by an integrity constraint (IC) expressed by a formula
$\Phi$ over DB states. In terms of the IC, the acceptability of the interaction is
expressed as follows: the interaction of the form above is acceptable if
$S_0 \models \Phi$ and $S_1 \models \Phi$. Thus, the system B representing the
interactive discrete dynamic system has in fact the form
$\langle MP, SP = \langle SP_1, \dots, SP_n \rangle, \Phi \rangle$, and its local
behavior is expressed in terms of acceptable interactions.

The global behavior of the system in the current state $S_0$ is represented by
(infinite) sequences of interactions starting in $S_0$, which we call trajectories:
$S_0 \vdash_{SP} S_1' \vdash_{MP} S_1 \vdash_{SP} S_2' \vdash_{MP} S_2 \dots$

A trajectory all of whose local interactions are acceptable represents the stable
behavior of the system: any possible destructive action of the slave programs
SP is compensated by some action of the master program MP along the whole
trajectory. Such trajectories are called stable.

The trajectories of the system B form a tree $T(S_0)$ with the root $S_0$. A
number of natural properties of the interactive behavior of B in a given DB state
can be formalized in terms of this tree, in particular, different kinds of stability.

Definition 1 Let $Q_1, Q_2 \in \{\forall, \exists\}$. Then B is $Q_1Q_2$-stable in
DB state $S_0$ if in the tree $T(S_0)$ there is a $Q_1Q_2$-subtree in which all
infinite branches are stable trajectories. (1)

One natural question connected with these notions is the algorithmic
decidability of the stability problems. Of course, in the general case these problems
are undecidable. In [1,2] some classes of systems of logic programs were
distinguished for which the stability problem is decidable. But only one slave (which
represents an internal control program of a DDDB) was allowed there, and a
very simple kind of updates was considered as possible actions of the master
(which represents an active environment of the system). Here we consider the more
general case when SP is represented by a set of programs working in parallel,
and MP belongs to the same class as the programs of SP. Because of space
limitations we present here only results for the case when the programs MP and
SP belong to the class GPROD of ground productional logic programs with
updates. We show that for the cooperative systems of this class the decision
problems for stability are decidable and have the same decision complexity as
the corresponding problems in [1,2]. To prove our results we show that the
variants of the stability problem considered here are reducible to the
satisfiability problem for a variant of the propositional logic of branching time.
Moreover, for $\exists\exists$-stability this reduction is simultaneously a reduction
to the satisfiability problem for the logic of linear time. As a corollary we obtain
that the results from [1,2] on the polynomial space and exponential time complexity of
the stability problems for GPROD generalize to systems of cooperating
programs from GPROD (a similar result also holds for the dual notion of
homeostaticity: in this case the slaves correct destructive actions of the master).
Moreover, we note that some lower bounds in [1,2] can be improved. Namely, the
results on EXPTIME-hardness of the stability problem can be complemented by
a lower bound on the time complexity.

(1) We omit here the straightforward definition of $Q_1Q_2$-subtrees. E.g., a
$\forall\exists$-subtree $T_1$ of $T$ has the following properties: if a node $N$
belongs to an even level of $T_1$, then all successors of $N$ in $T$ are also
successors of $N$ in $T_1$, and any node belonging to an odd level of $T_1$ has at
least one successor (from $T$).

2 Basic Notions and Definitions 

2.1 Productional Logic Programs 

We consider productional logic programs with updates in a signature U consisting
of a set of constants C and a set of predicate symbols Pr. Let H denote
the Herbrand base over U. A productional logic program defines a unique
intensional predicate $q \notin Pr$ by a set of clauses of the form

$q \leftarrow Con_1, \ldots, Con_k, Act_1, \ldots, Act_m$

where each $Con_i$ (elementary condition) is either a ground atom of H or
its negation, and each $Act_j$ (action) is one of the elementary updates insert(A),
delete(A) where $A \in H$. In fact, these rules are equivalent to the productions
used in AI, so in what follows we use their usual syntax:

$Con_1 \,\&\, \ldots \,\&\, Con_k \Rightarrow Act_1, \ldots, Act_m.$

We can assume that there are no conflicts in the application of actions within a
production, i.e. no production $\pi$ contains both $Act_i$ = insert(A) and
$Act_j$ = delete(A).

A database (DB) state $\mathcal{E}$ is a finite subset of the Herbrand base H. A
production $\pi$ is applicable to a DB state $\mathcal{E}$ iff for every $1 \le i \le k$, $Con_i \in \mathcal{E}$ if
$Con_i$ is a ground atom, and $A \notin \mathcal{E}$ if $Con_i = \neg A$ is a negated ground atom. We
consider here only productional programs such that for any DB state $\mathcal{E}$ at least
one production of the program is applicable to $\mathcal{E}$ (this assumption is inessential
and is made only to make the temporal logic formulas that describe the stability
properties below less cumbersome).

Let $\pi = \langle \pi_1, \ldots, \pi_n \rangle$ be a set of productions. We now define the simultaneous
application of these productions to a state $\mathcal{E}$. It can be defined in different
ways; we choose the following one. If there is a production $\pi_i$ not applicable
to $\mathcal{E}$, then $\pi$ is not applicable to $\mathcal{E}$. Otherwise the result $\pi(\mathcal{E})$ of the simultaneous
application of the productions of $\pi$ to $\mathcal{E}$ is defined as the DB state $\mathcal{E}_1$ obtained from
$\mathcal{E}$ by adding all atoms A such that there is a production $\pi_i \in \pi$ whose action
includes insert(A) and by deleting all atoms A such that there is a production
$\pi_i \in \pi$ whose action includes delete(A). When for some atom A there are two
productions in $\pi$, one of which inserts A and the other of which deletes
it, then A does not change, i.e. $A \in \mathcal{E}_1 \Leftrightarrow A \in \mathcal{E}$. (Of course, other
strategies of conflict resolution are possible as well, e.g. introducing some kind
of priority for programs.) So the set of productions $\pi$ defines an update relation
$\vdash_{\pi}$ on the set of all DB states: $\mathcal{E} \vdash_{\pi} \mathcal{E}_1$ iff $\mathcal{E}_1 = \pi(\mathcal{E})$. The update relation
$\vdash_{LP}$ induced by a set of productional logic programs $LP = \langle LP_1, \ldots, LP_n \rangle$
is defined as

$\vdash_{LP} \;=\; \bigcup \{\, \vdash_{\pi} \;\mid\; \pi = \langle \pi_1, \ldots, \pi_n \rangle,\ \pi_i \in LP_i \,\}.$
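
As an illustration only, the simultaneous application of productions and the induced update relation can be sketched in Python. The tuple encoding of a production, the function names, and the example productions P1 and P2 are assumptions made for this sketch; they do not come from the paper.

```python
from itertools import product

# Assumed encoding: a production is a 4-tuple
# (positive_conditions, negated_conditions, inserts, deletes),
# each component being a frozenset of ground atoms (plain strings here).

def applicable(production, state):
    """All positive conditions hold in the state and no negated one does."""
    positive, negated, _, _ = production
    return positive <= state and not (negated & state)

def apply_simultaneously(productions, state):
    """Return the state after one simultaneous step, or None if some
    production of the tuple is not applicable to the state."""
    if not all(applicable(p, state) for p in productions):
        return None
    inserts = frozenset().union(*(p[2] for p in productions))
    deletes = frozenset().union(*(p[3] for p in productions))
    conflicts = inserts & deletes  # inserted and deleted at once: left unchanged
    return (state | (inserts - conflicts)) - (deletes - conflicts)

def successors(programs, state):
    """One step of the update relation: choose one production per program."""
    results = set()
    for choice in product(*programs):
        new_state = apply_simultaneously(choice, state)
        if new_state is not None:
            results.add(new_state)
    return results

# Hypothetical productions:  a => insert(b)   and   not c => insert(d), delete(b)
P1 = (frozenset({"a"}), frozenset(), frozenset({"b"}), frozenset())
P2 = (frozenset(), frozenset({"c"}), frozenset({"d"}), frozenset({"b"}))
```

With these two productions the atom b is both inserted and deleted, so its membership is preserved: applying them to the state {a} gives {a, d}, and applying them to {a, b} gives {a, b, d}.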

We consider as integrity constraints (ICs) quantifier-free first order formulas
over U. We say that a DB state $\mathcal{E}$ satisfies an IC $\Phi$ iff $\mathcal{E} \models \Phi$.



2.2 Propositional Temporal Logic 

We use the following variant of the propositional logic of branching time (BPTL). It
differs from the logics CTL and CTL* considered in the survey [3] by the presence
of the past temporal operator $\forall Y$ ("in the previous state"), though it is simpler
in other respects: it does not contain complex temporal operators of the kind
$\forall F$. CTL-like logics with past temporal operators were considered in [4] and [5].
The temporal structure used in BPTL is tree-like (branching forwards and linear
backwards). This variant of the time structure is also considered in [5] among
other variants (the time structure used in [4] is branching backwards as well as
forwards). Other systems, more general than BPTL, can be found in the area of
propositional dynamic logics.

The formulas of BPTL are constructed from propositional variables by using
the Boolean connectives and the temporal operators $\forall X$, $\forall Y$ and $\forall G$ (the operators
$\exists X$, $\exists Y$ and $\exists F$ are expressed as $\neg\forall X\neg$, $\neg\forall Y\neg$ and $\neg\forall G\neg$, respectively).

Models of BPTL have the form $\langle T, \pi \rangle$ where $T$ is an infinite tree
with branches of height $\omega$, and $\pi$ assigns to any node $s$ of $T$ a set of
propositional variables satisfied at $s$ (as usual we write $s \models p$ instead of
$p \in \pi(s)$). The relation $\models$ is extended to all formulas of BPTL in the
following way:

the semantics of the Boolean connectives is defined as usual;

$s \models \forall X \varphi$ iff $s' \models \varphi$ for all sons $s'$ of $s$;

$s \models \forall Y \varphi$ iff $s' \models \varphi$, where $s$ is not the root of $T$ and $s'$ is the predecessor
of $s$ (if $s$ is the root we assume $s \models \forall Y \varphi$ for any formula $\varphi$);

$s \models \forall G \varphi$ iff $s' \models \varphi$ for all nodes $s'$ of the forward paths beginning with $s$.

According to the definition above, $\exists Y \varphi$ means "there exists a predecessor
$s'$ of $s$ such that $s' \models \varphi$" (the meaning of the other operators is also clear).

The version of the semantics for BPTL given above supposes that the time structure
is branching forwards and linear backwards. Another variant of the semantics for
BPTL assumes that time is linear forwards, too.
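
The satisfaction clauses above can be tried out on a small example. The following Python sketch is an assumption-laden illustration: BPTL models are infinite trees, whereas the sketch evaluates formulas on a finite tree, so $\forall X$ and $\forall G$ hold vacuously beyond the leaves. The class Node, the tuple encoding of formulas, and the helper EY are hypothetical names introduced here.

```python
class Node:
    """A node of a finite tree, labelled with the variables true at it."""
    def __init__(self, props, children=()):
        self.props = frozenset(props)
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

def holds(node, formula):
    """Evaluate a formula given as a nested tuple, e.g. ("AX", ("var", "q"))."""
    op = formula[0]
    if op == "var":
        return formula[1] in node.props
    if op == "not":
        return not holds(node, formula[1])
    if op == "and":
        return holds(node, formula[1]) and holds(node, formula[2])
    if op == "AX":  # the argument holds at all sons
        return all(holds(child, formula[1]) for child in node.children)
    if op == "AY":  # true at the root, otherwise checked at the predecessor
        return node.parent is None or holds(node.parent, formula[1])
    if op == "AG":  # the argument holds at every node reachable forwards
        return holds(node, formula[1]) and all(
            holds(child, formula) for child in node.children)
    raise ValueError(f"unknown operator: {op}")

def EY(formula):
    """The derived operator: not AY not."""
    return ("not", ("AY", ("not", formula)))

# Hypothetical three-node tree: a root satisfying p with two sons.
leaf_q = Node({"q"})
leaf_pq = Node({"p", "q"})
root = Node({"p"}, [leaf_q, leaf_pq])
```

On this tree $\forall X q$ holds at the root, and $\exists Y p$ holds at each son but not at the root, which has no predecessor.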

3 Reduction of Stability to BPTL 

In this section we construct, for a system of cooperating logic programs
$B = \langle MP, SP = \langle SP_1, \ldots, SP_n \rangle, \Phi \rangle$ and a DB state $\mathcal{E}$, BPTL-formulas representing
$Q_1Q_2$-stability of $B$ in $\mathcal{E}$. To simplify notation we suppose that $n = 2$. Let
MP be the productional logic program

$f_1 \Rightarrow upd_1$
$\ldots$
$f_m \Rightarrow upd_m,$

let $SP_1$ be the productional logic program

$f_{11} \Rightarrow upd_{11}$
$\ldots$
$f_{1n_1} \Rightarrow upd_{1n_1},$

and let $SP_2$ be the productional logic program

$f_{21} \Rightarrow upd_{21}$
$\ldots$
$f_{2n_2} \Rightarrow upd_{2n_2},$

where $f_i$, $f_{ij}$ are conditions and $upd_i$, $upd_{ij}$ are updates.

Any DB state $\mathcal{E}$ can be described statically as the conjunction $Conj(\mathcal{E})$ of the
(positive ground) atoms occurring in $\mathcal{E}$. But to reflect the changes of states caused
by the actions of MP, $SP_1$ and $SP_2$ we also have to take into account
some negative atoms. So with any DB state $\mathcal{E}$ and logic programs
MP, $SP_1$, $SP_2$ we associate the formula $s(\mathcal{E})$, which is the conjunction of $Conj(\mathcal{E})$
and the negations of the ground atoms occurring in MP, $SP_1$ or $SP_2$ but not in $\mathcal{E}$.

Note that in fact we can consider ground atoms as propositional letters.
In what follows $\Sigma$ will denote the set of propositional letters which occur in
MP, $SP_1$, $SP_2$.

For any update upd which inserts $a_1, \ldots, a_k$, deletes $b_1, \ldots, b_l$ and leaves
invariant $c_1, \ldots, c_m$ we introduce below a formula UPD with the intended
meaning:

$s \models UPD$ iff the following is true: for any $\mathcal{E}$ the formula $s(\mathcal{E})$ is satisfied in
$s$ iff there exists a DB state $\mathcal{E}'$ such that $\mathcal{E}$ is obtained by applying upd to $\mathcal{E}'$
and $s(\mathcal{E}')$ is satisfied in the state $s'$ of $T$ previous to $s$.

UPD has the form

$\bigwedge_{i=1}^{m} (c_i \equiv \exists Y c_i) \wedge \bigwedge_{i=1}^{k} a_i \wedge \bigwedge_{i=1}^{l} \neg b_i.$

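For any propositional letter $x$ we introduce new variables $x^{-}, x^{+}$. Using these variables we
introduce two variants of UPD:

$UPD' = \bigwedge_{i=1}^{m} (c_i \equiv \exists Y c_i) \wedge \bigwedge_{i=1}^{k} (a_i^{+} \wedge a_i) \wedge \bigwedge_{i=1}^{l} (b_i^{-} \wedge \neg b_i)$

$UPD'' = \bigwedge_{x} (\neg x^{+} \wedge \neg x^{-}) \wedge \bigwedge_{i=1}^{m} (c_i \equiv \exists Y c_i)$
$\quad \wedge \bigwedge_{i=1}^{k} ((\neg\exists Y a_i^{-} \to a_i) \wedge (\exists Y a_i^{-} \to (a_i \equiv \exists Y \exists Y a_i)))$
$\quad \wedge \bigwedge_{i=1}^{l} ((\neg\exists Y b_i^{+} \to \neg b_i) \wedge (\exists Y b_i^{+} \to (b_i \equiv \exists Y \exists Y b_i))).$

Formulas UPD' and UPD'' are used below to simulate parallel executions
of productions from $SP_1$ and $SP_2$; the variables $x^{+}$ and $x^{-}$ in them are used
to store the information for the conflict resolution.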

Let Q be a new propositional variable. Then the formula EVEN of the
form

$Q \wedge \forall G\,(Q \to (\forall X\,\neg Q \wedge \forall X\,\forall X\, Q))$

expresses the property "Q is true exactly at the states on even levels of T". A
somewhat more complicated formula THIRD (using some auxiliary propositional
letter besides Q) expresses the property "Q is true exactly at the states on every
third level of T".

Let Safety denote the formula

$THIRD \wedge \forall G\,(Q \to \Phi).$

It is obvious that if this formula is satisfied at the root of T then the integrity
constraint $\Phi$ is satisfied in all states on every third level of T.

Now we are ready to write out the formulas which show reducibility of the 
stability problems for the cooperating logic programs to the satisfiability problem 
for BPTL. 

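($\exists\exists$): $B = \langle MP, \langle SP_1, SP_2 \rangle, \Phi \rangle$ is $\exists\exists$-stable in $\mathcal{E}$
iff the formula

$s(\mathcal{E}) \wedge Safety \wedge \forall G\,\bigl(Q \to \bigvee_{i=1}^{n_1} (f_{1i} \wedge \exists X\,(UPD'_{1i} \wedge \bigvee_{j=1}^{n_2} (\exists Y f_{2j} \wedge \exists X\,(UPD''_{2j} \wedge \bigvee_{k=1}^{m} (f_k \wedge \exists X\, UPD_k)))))\bigr)$

is satisfiable.

Remark. For this formula the linear and branching time satisfiability coincide.

($\forall\exists$): $B = \langle MP, \langle SP_1, SP_2 \rangle, \Phi \rangle$ is $\forall\exists$-stable in $\mathcal{E}$
iff the formula

$s(\mathcal{E}) \wedge Safety \wedge \forall G\,\bigl(Q \to \bigvee_{i=1}^{n_1} f_{1i} \wedge \bigvee_{j=1}^{n_2} f_{2j} \wedge \bigwedge_{i=1}^{n_1} (f_{1i} \to \exists X\,(UPD'_{1i} \wedge \bigwedge_{j=1}^{n_2} (\exists Y f_{2j} \to \exists X\,(UPD''_{2j} \wedge \bigvee_{k=1}^{m} (f_k \wedge \exists X\, UPD_k)))))\bigr)$

is satisfiable.

Similar formulas can be given for $\forall\forall$-stability and $\exists\forall$-stability.
Moreover, it is not difficult to modify these formulas to describe the stability
properties for systems of productional programs defining partial relations
and for some other variants of cooperative execution of productional programs.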

Remark 1. In fact, the use of the past operators in these formulas can be avoided.
For $\exists\exists$-stability this is rather straightforward (using the linear time logic)
and does not increase the size of the corresponding formulas. But in the case
of $\forall\exists$-stability the branching structure of time is essential. It seems that in
this case describing updates without the past operator requires
explicitly considering all the DB states, which leads to an exponential increase
of the formula size.

Remark 2. The papers [1,2] also contain decidability results for the stability
problems for systems of more powerful logic programs (with variables and some
variants of restricted recursion). Some of these results can also be extended by
reducing the corresponding problems to the quantifier-free temporal predicate
logic.

Note that all these reductions have polynomial complexity. In general, the
reduction formulas have polynomial size with respect to the size of the original
logic programs. So if we use an exponential time decision algorithm for
BPTL (such algorithms can be obtained by easily adapting the known algorithms
for different propositional temporal or dynamic logics, e.g. from [6]) we obtain
for the stability problems an upper complexity bound which has the form
of an exponential of a polynomial. However, we can obtain a more exact
complexity bound, since the complexity of the satisfiability problem for BPTL
is exponential in the number of subformulas of the formula considered
(not in the length of the formula), and it is easy to see that the number
of subformulas in the reduction formulas is linear with respect to the size of the
original programs. So we obtain the following

Theorem 1. (i) The $Q_1Q_2$-stability problem for cooperating programs in
GPROD with quantifier-free integrity constraints is decidable in exponential
time for any $Q_1, Q_2 \in \{\exists, \forall\}$;

(ii) The $\exists\exists$-stability problem for the same classes of programs and integrity
constraints is decidable in polynomial space.

Point (ii) is obtained using the reduction to the linear time logic (see the remark
above), for which there exists an algorithm with polynomial space complexity (see [3]).

Acknowledgement. We express our gratitude to the anonymous referees for
their helpful comments.

Abstract. The rule-based paradigm for knowledge representation appears
in many disguises within computer science. In this paper we address
special issues which arise when the rule-based programming paradigm
is employed in the development of reactive systems. We begin by presenting
a rule-based language RL which has emerged while developing
intelligent cruise control systems. We define a desired declarative semantics
and correctness criteria for rule-based programs which respect
causality, the synchrony assumption and desired determinism. Two alternative
approaches are proposed to analyze RL programs. Both approaches
build upon static checks of a rule-based program. In the first approach
we accept programs which are correct with respect to a constructive semantics,
while in the second approach a stratification check is imposed.
The combination of rules and reactive behaviour, together with a formal
analysis of this behaviour, is the main contribution of our work.



1 Overview 

The rule-based paradigm for knowledge representation appears in many dis- 
guises within computer science. Language issues related to this paradigm appear 
in production systems [3], parallel program design (e.g. Unity [2]), default rea- 
soning within AI [9], logic programming [1], rewriting [7], active and deductive 
databases [4], and logics for action and change [15]. 

Our work combines results from the three areas of rule-based knowledge
representation, reactive systems [11,6], and programming language semantics. The
combination of rules and reactive behaviour, together with a formal analysis of
this behaviour, is thus the main contribution of our work. Approaches
to the specification of real-time and reactive systems range over automata-based formalisms,
temporal logics, Petri nets, action systems, and process algebras. In our view a
rule-based language with a formal semantics shares the benefits of these
specification languages. In addition, it has a special appeal: it mimics the natural mode
of reasoning by humans in many applications. Therefore, it can be considered
a powerful tool for capturing expert knowledge and formally analyzing it.
Moreover, rules can be executed and can therefore be seen as both a specification and
a programming language.

The synchronous family of high-level programming languages [5] for real-time 
systems (Lustre, Esterel, Signal) shares the above characteristic. They too can 
be used both for capturing high level design and as executable code. Though 
very different in syntax and style of programming, adding reactiveness to our 
rules leads to formal semantics which is reminiscent of a couple of the proposed 
semantics for Statecharts [14], and Esterel [13]. 



2 Rules and Reactiveness 

A reactive rule-based system (illustrated in Figure 1) is a system that reacts
continuously to the changes of its environment [12]. Such a system is composed
of three entities called state, rules, and inference engine. The state consists
of slots: state variables with associated pairs of values indicating the previous
and the current value of the slot, respectively. During a period when no changes
happen (equilibrium period, EP), the two values of a slot are identical. At a point
when there is a change (a stimulus comes from the environment), the current
value of some slot is updated. We call such a moment an asynchronous
computational point (ACP). At each ACP, the stimulus triggers one or more
rules, producing new changes in the slots, which in turn trigger other rules,
and so on. This continues until no more changes are possible, i.e. a steady state is
reached. Then the system starts "resting" in its new EP, awaiting new stimuli.
The inference engine is in charge of the computations at the ACPs.




Fig. 1. A reactive rule-based system

The rule language RL (its syntax can be found in the appendix) was developed to
express the responses of the system at each ACP. The language has been successfully
used for developing a reactive application: a driver-support system [10].

The rules in an RL program have an event-condition-action form, e.g.: 

WHEN A *= a IF (B *= b AND NOT E |= e) THEN D := d; 

read as “When A changes to a, then if B changes simultaneously to b and E has
not been e, then D obtains value d”. The WHEN part, A *= a, is called the trigger
part of the rule; the IF part, (B *= b AND NOT E |= e), is called the condition
part of the rule; and the THEN part, D := d, is called the assignment part of the
rule. The trigger part and the condition part together are called the precondition
of the rule. The characteristics of this language are:

- The meaning of a reactive program is independent of the ordering of the
rules (in larger systems rule ordering is a cumbersome and error-prone
process; the semantics of such programs is unclear and easy to distort). In
our approach a program can be enhanced by simply adding new rules to the
existing rule base;

- The language assumes finite domains for variables (cf. Datalog), allowing a
finite model;

- The language allows the logical operations negation and conjunction;

- The language allows concurrent events to be taken into account (in the example
rule, the events A *= a and B *= b occur simultaneously);

- The language models time flow without introducing metric time (E |= e
checks whether “E has had value e before”, while E *= e checks whether “E has changed
to value e”).

A rule responds to external stimuli at a given state by checking whether
the rule is enabled at the current state, and firing the rule (performing the
assignments) if it is.

A stimulus to a system, denoted by $I$, is a set of changes, which are (slot,
value) pairs. A state of a rule-based system is a pair $(S, C)$ where $S$ contains
the values of all the variables (slots), and $C$ contains the set of changes. We use
$S_x$ to denote the value of $x$ in the latest EP. During an EP, $S$ is the same and
$C = \emptyset$. At an ACP, $S$ is the same as $S$ in the previous EP and $C$ contains the
changes occurring at this ACP, including the external stimuli and the changes
derived as the result of the assignments of the enabled rules.

A rule $r$ being enabled at a state $(S, C)$ is denoted by $(S, C) \vdash r$. A rule $r$
being not enabled at a state $(S, C)$ is denoted by $(S, C) \nvdash r$. To check whether
$(S, C) \vdash r$, we only need to check whether all the primitive preconditions of rule $r$
are satisfied at $(S, C)$. By a primitive precondition we mean a positive condition,
X |= v (was) or X *= v (changes to), or a negative condition,
NOT X |= v (was not) or NOT X *= v (does not change to). The trigger part of
a rule contains only one primitive condition X *= v, while the condition part
of a rule can be a conjunction of primitive conditions. We define $\vdash$ for rules by
first defining $\vdash$ for the primitive conditions of rules, here delimited by [ ].

- $(S, C) \vdash$ [x |= v] iff $S_x = v$;

- $(S, C) \vdash$ [x *= v] iff $S_x \neq v$ and $(x, v) \in C$;

- negation (NOT) and conjunction (AND) are interpreted as standard logical
connectives. That is:

  • $(S, C) \vdash$ [NOT p], where p is any positive primitive, iff not $(S, C) \vdash p$.

  • $(S, C) \vdash$ [$p_1$ AND $p_2$], where $p_1$ and $p_2$ are primitive preconditions, iff
  $(S, C) \vdash p_1$ and $(S, C) \vdash p_2$.

- $(S, C) \nvdash r$ iff not $(S, C) \vdash r$.

Let’s look at a simple example. Suppose x, y, and z are the three slots of the
system. Let $S_x = 0$, $S_y = 0$, $S_z = 0$ and $C = I = \{(x, 1)\}$. Program P1 contains
only one rule r1:

r1: WHEN x *= 1 IF y |= 0 THEN z := 1;

Rule r1 is enabled at $(S, C)$ since $(x, 1) \in C$ and $S_y = 0$. The effect of firing this
rule is to assign 1 to z. Therefore, the set of changes becomes $C1 = \{(x, 1), (z, 1)\}$.
Let’s consider another program P2 containing only r2, with the same $(S, C)$:

r2: WHEN z *= 1 IF y |= 0 THEN y := 1;

Rule r2 is not enabled at $(S, C)$ since $(z, 1) \notin C$. Therefore, the set of changes
is still $\{(x, 1)\}$.
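
The enabledness check used in this example can be sketched in Python. This is only an illustrative encoding, not the RL implementation: primitive conditions are assumed to be tuples (negated, operator, slot, value), rules are dictionaries with a precondition list and an assignment set, and all names are hypothetical.

```python
def primitive_holds(primitive, S, C):
    """Evaluate one primitive condition at the state (S, C)."""
    negated, op, slot, value = primitive
    if op == "|=":      # "was": the slot had this value in the latest EP
        result = S[slot] == value
    elif op == "*=":    # "changes to": the old value differs and the change is in C
        result = S[slot] != value and (slot, value) in C
    else:
        raise ValueError(f"unknown operator: {op}")
    return not result if negated else result

def enabled(rule, S, C):
    """A rule is enabled iff all its primitive preconditions hold."""
    return all(primitive_holds(p, S, C) for p in rule["pre"])

# r1: WHEN x *= 1 IF y |= 0 THEN z := 1;
R1 = {"pre": [(False, "*=", "x", 1), (False, "|=", "y", 0)], "assign": {("z", 1)}}
# r2: WHEN z *= 1 IF y |= 0 THEN y := 1;
R2 = {"pre": [(False, "*=", "z", 1), (False, "|=", "y", 0)], "assign": {("y", 1)}}

S = {"x": 0, "y": 0, "z": 0}
C = {("x", 1)}                      # the stimulus I
C1 = C | R1["assign"] if enabled(R1, S, C) else set(C)
```

Under this encoding r1 is enabled at (S, C) and r2 is not, and C1 contains the two changes (x, 1) and (z, 1), in agreement with the example.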

If an RL program contains several rules, then the response of the system at
each ACP may no longer consist of only one (or zero) rule firings. There can be
several rule firings, some of which are caused by others.

3 Synchrony Assumption and Causality 

One might ask why the responses only occur at ACPs. The fundamental as- 
sumption taken here is the synchrony assumption: each response is assumed to 
be synchronous with the effects it causes. This assumption is realistic if the re- 
sponses of the system are fast enough so that the environment does not change 
during the responses (which should be checked in practice). The effects of the 
execution of one component are instantly broadcast to all the other components 
of the system. Therefore, all the components of the system have the same view 
of the system state. 

The smallest component of an RL program is a single rule. If several rules
are fired at the same ACP, then all the rule firings are considered to occur at
the same time. If only the synchrony requirement is considered, it does not matter
how the rule firings are carried out step by step; only the result of the
response is of interest. The result of a response at $(S, I)$ is a stable state $(S, C')$ and a set of
fired rules $R^f$ where:

- $C'$ is the result of firing all the rules in $R^f$ at the given initial state $(S, I)$.
Let $A_r$ denote the assignments of rule $r$. Then

$C' = \bigcup_{r \in R^f} A_r \cup I.$

$(S, C')$ is seen as the state after the response.

- $R^f$ is the maximal set of rules that are enabled at state $(S, C')$. First, all the
rules in $R^f$ are enabled at $(S, C')$. Second, no rule outside
$R^f$ is enabled at $(S, C')$.

However, we would like to retain causality, which is a very important property
for a reasoning system. The principle of causality requires that any change issued
should have a sequence of (enabled) rule firings leading to it. The following example
shows a causal reasoning. By composing the earlier programs P1 and P2, we get a
new program P3 which contains two rules: r1 and r2. One can infer that both r1
and r2 are fired and the new set of changes becomes $C3 = \{(x, 1), (y, 1), (z, 1)\}$.
The reasoning is simple. Since r1 is enabled at $(S, I)$, r1 is fired and its effect,
the change $(z, 1)$, is instantaneously broadcast. The system state becomes $(S, C1)$
where $C1 = \{(x, 1), (z, 1)\}$. Since r2 is enabled at $(S, C1)$, r2 is also fired and
results in the final set of changes C3.

For the above example, $C' = C3$ and $R^f = \{r1, r2\}$. The synchrony requirement
is also satisfied since $C3 = I \cup A_{r1} \cup A_{r2}$, and r1 and r2 are the only rules
enabled at $(S, C3)$.

However, not all responses respect both the synchrony hypothesis and the
principle of causality. Let’s look at two examples.

Given $S$ where $S_x = 0$, $S_y = 0$, $S_z = 0$, $I = \{(y, 1)\}$ and a program with two
rules r3 and r4, what are the final state and the fired rule set?

r3: WHEN x *= 1 IF y |= 0 THEN z := 1; 
r4: WHEN z *= 1 IF y |= 0 THEN x := 1; 

There are two solutions which satisfy the synchrony requirement. One is $C' = I$
and $R^f = \emptyset$. The other is $C' = \{(y, 1), (x, 1), (z, 1)\}$ and $R^f = \{r3, r4\}$. The
problem with the second solution is that without the firing of r4, r3 cannot be
fired. The same holds for r4: without the firing of r3, r4 cannot be fired. The result
is self-triggered. In other words, it is not causal, since we cannot generate
this final result via a causal sequence of rule firings.
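
The two solutions can be found mechanically by brute force. The Python sketch below is an illustration under assumptions: it uses a hypothetical tuple encoding of primitive conditions (negated, operator, slot, value), tries every subset of rules as a candidate fired set, and keeps those whose resulting change set enables exactly the candidate rules. It checks only the synchrony requirement, not causality.

```python
from itertools import combinations

def primitive_holds(primitive, S, C):
    negated, op, slot, value = primitive
    if op == "|=":
        result = S[slot] == value
    else:  # "*="
        result = S[slot] != value and (slot, value) in C
    return not result if negated else result

def enabled(rule, S, C):
    return all(primitive_holds(p, S, C) for p in rule["pre"])

def synchronous_solutions(rules, S, I):
    """Return all pairs (fired set, change set) meeting the synchrony requirement."""
    names = sorted(rules)
    solutions = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            C = set(I)
            for name in subset:          # C' = I plus the assignments of the fired rules
                C |= rules[name]["assign"]
            enabled_now = {n for n in names if enabled(rules[n], S, C)}
            if enabled_now == set(subset):   # exactly the fired rules are enabled
                solutions.append((frozenset(subset), frozenset(C)))
    return solutions

S = {"x": 0, "y": 0, "z": 0}
I = {("y", 1)}
RULES = {
    # r3: WHEN x *= 1 IF y |= 0 THEN z := 1;
    "r3": {"pre": [(False, "*=", "x", 1), (False, "|=", "y", 0)], "assign": {("z", 1)}},
    # r4: WHEN z *= 1 IF y |= 0 THEN x := 1;
    "r4": {"pre": [(False, "*=", "z", 1), (False, "|=", "y", 0)], "assign": {("x", 1)}},
}
SOLUTIONS = synchronous_solutions(RULES, S, I)
```

For r3 and r4 the search returns exactly the two solutions discussed above: the empty fired set with the change set I, and the self-triggered pair {r3, r4}.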

The above example shows that not all responses satisfying the synchrony
requirement are causal. Next, we show that not all causal responses satisfy
the synchrony requirement either.

Suppose $(S, I)$ is given by $S_x = 0$, $S_y = 0$, $S_z = 0$, $I = \{(y, 1)\}$, and the program is as
follows.

r5: WHEN y *= 1 IF NOT x *= 1 THEN z := 1; 
r6: WHEN y *= 1 IF x |= 0 THEN x := 1; 

A causal rule firing sequence is r5 followed by r6, which results in
$C' = \{(y, 1), (x, 1), (z, 1)\}$. The problem is that r5 is not enabled at $(S, C')$, which
violates the synchrony requirement.
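
This violation can be reproduced with a small Python sketch. The encoding of primitive conditions as tuples (negated, operator, slot, value) and the helper fire_in_order are assumptions made for illustration; the helper simply fires the rules in a given order whenever they are enabled at the changes accumulated so far.

```python
def primitive_holds(primitive, S, C):
    negated, op, slot, value = primitive
    if op == "|=":
        result = S[slot] == value
    else:  # "*="
        result = S[slot] != value and (slot, value) in C
    return not result if negated else result

def enabled(rule, S, C):
    return all(primitive_holds(p, S, C) for p in rule["pre"])

def fire_in_order(rules, order, S, I):
    """Fire each rule of `order` in turn if it is enabled at the current changes."""
    C, fired = set(I), []
    for name in order:
        if enabled(rules[name], S, C):
            C |= rules[name]["assign"]
            fired.append(name)
    return C, fired

S = {"x": 0, "y": 0, "z": 0}
I = {("y", 1)}
RULES = {
    # r5: WHEN y *= 1 IF NOT x *= 1 THEN z := 1;
    "r5": {"pre": [(False, "*=", "y", 1), (True, "*=", "x", 1)], "assign": {("z", 1)}},
    # r6: WHEN y *= 1 IF x |= 0 THEN x := 1;
    "r6": {"pre": [(False, "*=", "y", 1), (False, "|=", "x", 0)], "assign": {("x", 1)}},
}
C_FINAL, FIRED = fire_in_order(RULES, ["r5", "r6"], S, I)
R5_STILL_ENABLED = enabled(RULES["r5"], S, C_FINAL)
```

Firing r5 and then r6 yields the three changes (y, 1), (x, 1), (z, 1), and r5 is no longer enabled at the final state. Firing in the order r6, r5 instead fires only r6, because the change (x, 1) already blocks r5.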

4 Other Requirements 

As we deal with variables, one important requirement is not to assign different
values to the same variable at the same ACP. Another requirement is that there
should be only one final result at each ACP. This requirement is understood as
observable determinism.

Next, we provide a desired semantics definition for a response which respects
the synchrony hypothesis, the principle of causality and the above requirements.



5 Declarative Semantics 



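Definition 1. Suppose R is the set of rules of a program P. The declarative
response of the program P in a state $(S, I)$ is any sequence of firings

$\sigma_0 \sigma_1 \ldots$

such that

- $\sigma_0 = (C_0, R^f_0) = (I, \emptyset)$,
- $\sigma_{i+1} = (C_{i+1}, R^f_{i+1}) = \begin{cases} (C_i \cup A_{r_f},\ R^f_i \cup \{r_f\}) \text{ where } r_f \in R_i & \text{if } R_i \neq \emptyset, \\ \sigma_i & \text{if } R_i = \emptyset, \end{cases}$
  with $R_i = \{ r \mid r \in R \setminus R^f_i \wedge (S, C_i) \vdash r \}$.

□

In the definition, each firing ($\sigma_i$) contains a set of changes ($C_i$) and a set of fired
rules ($R^f_i$).

It can be proved that a declarative response always has finite length [8].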



Definition 2. Let R be the rule set of a program P. Let a declarative response of
the program in a state $(S, \emptyset)$ to a stimulus $I$ be $\sigma_0 \sigma_1 \ldots \sigma_m$. Let $\sigma_m = (C_m, R^f_m)$
and $R^f_m = \{r_1, r_2, \ldots, r_m\}$. The declarative response is correct if and only if

- the response is rule-consistent:

$\forall r\,(r \in R^f_m \to (S, C_m) \vdash r)$

that is, none of the rules fired in this response becomes disabled after the
final firing;

- the response is slot-consistent:

$\forall x\,((x, v_1) \in C_m \wedge (x, v_2) \in C_m \to v_1 = v_2)$

that is, no slot can have more than one change of value in this response;

- the response is unambiguous: for any other declarative response $\sigma_0 \sigma'_1 \ldots \sigma'_l$
with $\sigma'_l = (C'_l, {R'}^{f}_{l})$ that is both rule-consistent and slot-consistent, we have

A correct response is the desired response. This semantics is referred to as
declarative semantics. An RL program is correct if and only if it has a correct
response for any possible combination of state and stimuli. Two natural questions
arise:

- Can we construct an operational semantics to implement the desired
declarative semantics?

- Can we identify the ill-behaved programs at compile time without having
to generate all the responses for each state-and-stimulus combination?

We will devote the next two sections to answering the above questions.
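
To make Definitions 1 and 2 concrete, the following Python sketch enumerates all declarative responses of a small program and tests the final firing for rule-consistency and slot-consistency. It is an illustration under assumptions: the tuple encoding of primitive conditions and all function names are hypothetical, and the unambiguity condition of Definition 2 is deliberately not checked.

```python
def primitive_holds(primitive, S, C):
    negated, op, slot, value = primitive
    if op == "|=":
        result = S[slot] == value
    else:  # "*="
        result = S[slot] != value and (slot, value) in C
    return not result if negated else result

def enabled(rule, S, C):
    return all(primitive_holds(p, S, C) for p in rule["pre"])

def declarative_responses(rules, S, I):
    """Yield (final change set, firing order) for every maximal firing sequence."""
    def extend(C, fired):
        candidates = [n for n in sorted(rules)
                      if n not in fired and enabled(rules[n], S, C)]
        if not candidates:               # no unfired enabled rule: sequence is final
            yield frozenset(C), tuple(fired)
            return
        for name in candidates:          # any enabled unfired rule may be chosen
            yield from extend(C | rules[name]["assign"], fired + [name])
    yield from extend(frozenset(I), [])

def rule_consistent(rules, S, C, fired):
    """Every fired rule is still enabled at the final state."""
    return all(enabled(rules[name], S, C) for name in fired)

def slot_consistent(C):
    """No slot receives two different values."""
    return len({slot for slot, _ in C}) == len(C)

S = {"x": 0, "y": 0, "z": 0}
I = {("y", 1)}
RULES = {
    "r5": {"pre": [(False, "*=", "y", 1), (True, "*=", "x", 1)], "assign": {("z", 1)}},
    "r6": {"pre": [(False, "*=", "y", 1), (False, "|=", "x", 0)], "assign": {("x", 1)}},
}
RESPONSES = list(declarative_responses(RULES, S, I))
CONSISTENT = [(C, fired) for C, fired in RESPONSES
              if rule_consistent(RULES, S, C, fired) and slot_consistent(C)]
```

For the program consisting of r5 and r6 there are two declarative responses, one firing r5 and then r6 and one firing only r6; only the latter passes both consistency checks.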

6 Constructive Semantics 

6.1 The Semantics 

Constructive semantics is an application of the three-valued-logic approach to 
non-monotonic reasoning in the setting of reactive systems. It also resembles 
the recently proposed semantics for pure Esterel [13]. The main differences are 
in the structure of programs (rule-based in our case, imperative in the case of 
Esterel), and the means of communication (change in slot values in our case, pure 
signals/events in Esterel). In what follows we present the constructive semantics. 

The constructive semantics needs not only positive information about the
changes of the system, but also negative information about the lack of changes.
In the constructive semantics we deal with an extended system state $(S, Z)$ where $S$
records the values of the slots before the ACP and $Z$ contains a set of annotated
changes, where each (slot, value) pair has an annotation indicating the status of
this change. The status is an element of the set $\{+, -, \bot\}$. $+$ is read as positive,
and $(x, v)^{+}$ means that $(x, v)$ does occur in this ACP; $-$ is read as negative,
and $(x, v)^{-}$ means that the change $(x, v)$ cannot possibly occur in this ACP; $\bot$
is read as unknown, and $(x, v)^{\bot}$ means that the change of $x$ to $v$ is not present
yet at this point of the computation, but it is not yet known whether it will take place
later.

The result of evaluating a rule is one of True, False or Unknown,
instead of only True or False as in 2-valued logic. A rule evaluates to
Unknown if it is not known whether the rule will evaluate to True or
False after this response. More specifically, a primitive condition (NOT x *= v)
evaluates to True at a state $(S, C)$ if $(x, v)$ does not belong to $C$ when reasoning
under 2-valued logic, but to Unknown in the constructive semantics
if $(x, v)$ is not explicitly marked with an unchangeable status (positive or negative).
The ordering between the status annotations is

$\{(\bot, -), (\bot, +), (\bot, \bot), (-, -), (+, +)\}.$

Let $Z$, $Z'$ be two sets of annotated changes. $Z$ is less informative than $Z'$, denoted
$Z \sqsubseteq Z'$, if and only if

$(\forall (x, v)^{\alpha} \in Z)(\exists \alpha')((x, v)^{\alpha'} \in Z' \wedge \alpha \le \alpha').$

Given $C$, $C^{+}$ is defined as the extension of $C$ where:

$C^{+} = \{(x, v)^{+} \mid (x, v) \in C\} \cup \{(x, v')^{-} \mid (x, v) \in C \wedge v' \neq v\} \cup \{(x, v)^{\bot} \mid \forall v'\, (x, v') \notin C\}.$

Symmetrically, given $Z$, $Z^{-}$ is defined as the reduction of $Z$ where:

$Z^{-} = \{(x, v) \mid (x, v)^{+} \in Z\}.$

A rule being 3-enabled at an extended state $(S, Z)$ is denoted by $(S, Z) \vdash_3 r$.
A rule being 3-non-enabled at an extended state $(S, Z)$ is denoted by $(S, Z) \nvdash_3 r$.
$(S, Z) \vdash_3 r$ if and only if all the primitive preconditions are evaluated to be
True at $(S, Z)$. $(S, Z) \nvdash_3 r$ if and only if one of the primitive preconditions is
evaluated to be False at $(S, Z)$. The evaluation of a primitive condition p at a
given extended state $(S, Z)$ is as follows:

- [x |= v] is True if $S_x = v$,
  [x |= v] is False if $S_x \neq v$;

- [x *= v] is True if $(x, v)^{+} \in Z$ and $S_x \neq v$,
  [x *= v] is False if $(x, v)^{-} \in Z$ or $S_x = v$;

- [NOT p] is True if p is False; [NOT p] is False if p is True;

- In all cases not covered above, a primitive condition is evaluated to
Unknown.

It should be observed that there are intermediate cases when neither $(S, Z) \vdash_3 r$
nor $(S, Z) \nvdash_3 r$ is true.
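
The three-valued evaluation can be sketched in Python as follows. This is a minimal sketch under assumptions, not the checker of [8]: Z is represented as a dictionary from (slot, value) pairs to one of the strings "+", "-" or "?" (standing for the unknown status), primitive conditions are tuples (negated, operator, slot, value), and the names eval3 and status3 are hypothetical.

```python
TRUE, FALSE, UNKNOWN = "True", "False", "Unknown"

def eval3(primitive, S, Z):
    """Three-valued evaluation of a primitive condition at the extended state (S, Z)."""
    negated, op, slot, value = primitive
    if op == "|=":
        result = TRUE if S[slot] == value else FALSE
    else:  # "*="
        status = Z.get((slot, value), "?")
        if S[slot] == value or status == "-":
            result = FALSE
        elif status == "+":
            result = TRUE
        else:
            result = UNKNOWN
    if negated and result != UNKNOWN:    # NOT swaps True and False, keeps Unknown
        result = FALSE if result == TRUE else TRUE
    return result

def status3(rule, S, Z):
    """Classify a rule as 3-enabled, 3-non-enabled, or undetermined."""
    values = [eval3(p, S, Z) for p in rule["pre"]]
    if all(v == TRUE for v in values):
        return "3-enabled"
    if any(v == FALSE for v in values):
        return "3-non-enabled"
    return "undetermined"

S = {"x": 0, "y": 0, "z": 0}
# r5: WHEN y *= 1 IF NOT x *= 1 THEN z := 1;
R5 = {"pre": [(False, "*=", "y", 1), (True, "*=", "x", 1)], "assign": {("z", 1)}}
```

With the change (y, 1) marked positive, rule r5 is undetermined while (x, 1) is unknown, 3-enabled once (x, 1) is marked negative, and 3-non-enabled once (x, 1) is marked positive.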

The negative changes are derived by the function never. The function never works
iteratively. At each iteration, a negative change is added to the set of annotated
changes. The change added has one of the following characteristics:

- No rule in the program can issue such a change.

- All the rules that can issue such a change are 3-non-enabled at the current
extended state.

When we say adding negative changes or positive changes, we mean updating
the annotation of the (slot, value) pair in the set of annotated changes. This is
done by the update function. The annotation can only be changed from $\bot$ to $+$
or $-$. An attempt to change the status from $+$ to $-$ or vice versa is a
symptom of slot-inconsistency. When such a situation occurs, the set of annotated
changes returned is the empty set, to indicate failure. The formal definitions of
never and update can be found in [8].

We are now in a position to define an operational semantics. 

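Definition 3. Given a program P with a rule set R, an initial system state
$(S, \emptyset)$, and a stimulus $I$, the constructive response of the program is a sequence

$\gamma_0 \gamma_1 \ldots$

such that

- $\gamma_0 = (Z_0, \emptyset)$, where $Z_0 = never(S, I^{+}, R)$,
- $\gamma_{i+1} = (Z_{i+1}, R^f_{i+1}) = \begin{cases} (\emptyset,\ R^f_i) & \text{if } Z_i = \emptyset, \\ (never(S, update(Z_i, A_{r_f}), R),\ R^f_i \cup \{r_f\}) & \text{if } Z_i \neq \emptyset \text{ and } r_f \in R_i \neq \emptyset, \\ \gamma_i & \text{if } Z_i \neq \emptyset \text{ and } R_i = \emptyset, \end{cases}$

where $R_i = \{ r \mid r \in R \setminus R^f_i \wedge (S, Z_i) \vdash_3 r \}$. □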

As we can see, if there exists an unfired rule that is 3-enabled in the current state,
and no slot-inconsistency occurred in the update of the previous step ($Z_i \neq \emptyset$),
then the current set of annotated changes $Z_i$ is updated with positive changes
and negative changes. The positive changes come either from the external stimulus
(step 0) or from the assignments of the selected rule that is 3-enabled at
the current extended state (subsequent steps). The negative changes derived by
the never function are those potential negative changes that can be deduced from
the current state. If there is no unfired rule 3-enabled in the current state, then
the procedure returns the same tuple as in the previous step. Finally, if the state
indicates the occurrence of slot-inconsistency ($Z_i = \emptyset$), the procedure returns
the empty set as the new set of annotated changes.

We say that the constructive response terminates at $Z_m$ if and only if
($Z_m = \emptyset$ and $Z_{m-1} \neq Z_m$) or ($R_m = \emptyset$ and $R_{m-1} \neq \emptyset$), that is, a slot-inconsistency
occurs or there is no rule to be selected.

A terminating constructive response is accepted if and only if it terminates at
$Z_m$, $Z_m \neq \emptyset$ and $(\forall (x, v)^{\alpha} \in Z_m)(\alpha \neq \bot)$. That is, an accepted constructive
response terminates normally, meaning that no slot-inconsistency occurs ($Z_m \neq \emptyset$),
and the set of annotated changes of its final state is complete, meaning that
no change in $Z_m$ is marked with $\bot$.

6.2 Properties 

It can be proved that, given a program P and $(S, I)$, all the constructive responses
reach the same final set of annotated changes (the theorem can be found in [8]).

It can also be proved that any accepted constructive response yields a correct 
declarative response. In order to prove this, we first define a mapping from an 
accepted constructive response to a sequence of firings and then prove that this 
sequence is a construction of a declarative response (see [8]). Then, we prove 
that this declarative response is a correct one (see theorem 1). 

Definition 4. Let $CR = \gamma_0 \gamma_1 \cdots \gamma_m$ be an accepted constructive
response, where $\gamma_i = (Z_i, R_i)$, $0 \le i \le m$. Then

$map(CR) = \sigma_0, \sigma_1, \ldots, \sigma_m$,

where ai = {Zff , rI) . □ 

Theorem 1. (Soundness) If DR = map(CR) is a declarative response obtained
from an accepted constructive response CR, then DR is correct. □

The proof can be found in [8]. 

The static checker performs an exhaustive check of the acceptability of the
responses for all possible states and stimuli. It can easily be proved that all
programs passing the constructive check procedure are correct with respect to
the desired declarative semantics.

7 Stratified Program 

The stratified program is a well-known notion in logic programming and deductive
databases. It was an early attempt to deal with dependencies between relations
in the presence of negation. The fixpoint computation along strata gives this
class of programs a natural semantics. We introduce the idea of stratification
into reactive rule-based systems to achieve rule-consistency. An arbitrary
declarative response is not necessarily rule-consistent, since a condition
(NOT x *= v) of a rule r can be disabled by firing other rules after r, which may
generate (x, v). Firing rules in a stratified order avoids this kind of
situation. Working with stratified rule sets has the following effect: every time
a rule whose condition part includes a negation over [s *= u] is tested for being
enabled, we can be sure that any rule assigning u to s has already been fired
earlier in the response (if it is included in the final fired rule set of this
response at all).

Note, however, that the user need not explicitly consider these dependencies
when introducing rules. The compile-time support is supposed to check whether
such a stratification exists. Given a program P and a pair (x, v), the
definition of (x, v) is the set of rules in whose assignment part (x, v) appears.

A stratified rule-based program consists of disjoint sets of rules
$P = P_1 \cup \ldots \cup P_i \cup \ldots \cup P_n$, called strata. If a program is
stratifiable, its stratification is constructed as follows:

— If a positive pair [x *= v] appears in the trigger part or condition part of a
rule from $P_i$, then its definition is contained within $\bigcup_{j \le i} P_j$.

— If a negative pair [NOT x *= v] appears in the condition part of a rule from
$P_i$, then its definition is contained within $\bigcup_{j < i} P_j$.
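
The two conditions can be checked mechanically. The following is a minimal
sketch, not the authors' implementation: it assumes a hypothetical encoding in
which each rule is a dictionary listing the pairs it reads positively (`pos`),
the pairs it reads under NOT (`neg`), and the pairs it assigns (`assigns`). It
computes the lowest admissible stratum for every rule (numbered from 0 here), or
returns None when no stratification exists.

```python
from collections import defaultdict


def stratify(rules):
    """Return a list with the stratum index of each rule, or None.

    Each rule is a dict with three sets of (slot, value) pairs:
    'pos' (read positively), 'neg' (read under NOT), 'assigns' (written).
    This encoding is an assumption made for the sketch.
    """
    n = len(rules)
    # definition of a pair = indices of the rules that assign it
    definers = defaultdict(set)
    for i, rule in enumerate(rules):
        for pair in rule["assigns"]:
            definers[pair].add(i)

    level = [0] * n
    changed = True
    while changed:
        changed = False
        for i, rule in enumerate(rules):
            # positive use: defining rules may sit in the same or a lower stratum
            for pair in rule["pos"]:
                for d in definers[pair]:
                    if level[i] < level[d]:
                        level[i] = level[d]
                        changed = True
            # negative use: defining rules must sit in a strictly lower stratum
            for pair in rule["neg"]:
                for d in definers[pair]:
                    if level[i] < level[d] + 1:
                        level[i] = level[d] + 1
                        changed = True
            # a level of n or more can only arise from a cycle through negation
            if level[i] >= n:
                return None
    return level


example = [
    {"pos": set(), "neg": set(), "assigns": {("a", "1")}},
    {"pos": {("a", "1")}, "neg": set(), "assigns": {("b", "1")}},
    {"pos": set(), "neg": {("b", "1")}, "assigns": {("c", "1")}},
]
cyclic = [{"pos": set(), "neg": {("p", "1")}, "assigns": {("p", "1")}}]
```

With these inputs, `stratify(example)` places the third rule one stratum above
the rule that defines (b, 1), while `stratify(cyclic)` reports that a rule which
negatively depends on its own assignment cannot be stratified.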

For a given stratified correct program, the responses generated by such an
operational semantics are correct if they are slot-consistent. Unfortunately, for a
stratified program this operational semantics does not guarantee slot-consistency.
Stratification simply provides a sufficient condition for rule-consistency.

8 Summary 

The technical results obtained in our research can be summarized as follows: 

— We have defined a rule-based language RL that combines asynchronous in- 
teraction with an environment with synchronous treatment of a response. 
Time and concurrency are thus dealt with in a simple manner; 

— For this language we have defined a declarative semantics which enables a
natural treatment of causality, atomicity, and desired determinism;

— We have defined a correctness criterion for reactive RL programs. A correct 
program ensures termination of rule firings at each reaction, consistency of 
the fired rules and a unique reaction for each new set of stimuli to the system; 

— We have defined and implemented constructive semantics, based on three- 
valued evaluation of rules, that guarantees the correct results of computa- 
tions for correct programs; 

— We have developed and implemented a static procedure for checking the 
correctness of programs; 

— We have proven soundness of the obtained results; 

— For stratified programs we have developed the computational support which 
guarantees correctness w.r.t. one particular consistency requirement. 

A Appendix: Syntax

The syntax for RL is defined as follows.

Definition 5. A rule is a string

    WHEN <rtrig> IF <rcond> THEN <rassign>

fulfilling the requirements of the following grammar:

    <rtrig>           ::= <slot-name> *= <slotval>
    <slotval>         ::= <ident>
    <rcond>           ::= <rcond> AND <rliteral>
                        | <rliteral>
                        | TRUE
    <rliteral>        ::= NOT <rliteral>
                        | <slot-name> *= <slotval>
                        | <slot-name> != <slotval>
    <rassign>         ::= <assignment> | { <assignment-list> }
    <assignment-list> ::= <assignment> | <assignment-list> , <assignment>
    <assignment>      ::= <slot-name> := <slotval>
    <slot-name>       ::= <ident>

where <ident> denotes an identifier.
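
To make the rule format concrete, here is a small, hedged sketch of a parser for
the WHEN ... IF ... THEN ... shape. It is not part of RL: the function names and
the returned dictionary layout are assumptions of this sketch, identifiers are
taken to be ASCII words, and only literals of the form `slot *= value`
(optionally prefixed by NOT) are handled.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
RULE = re.compile(r"^\s*WHEN\s+(.*?)\s+IF\s+(.*?)\s+THEN\s+(.*?)\s*$", re.S)
PAIR = re.compile(rf"^({IDENT})\s*\*=\s*({IDENT})$")
ASSIGN = re.compile(rf"^({IDENT})\s*:=\s*({IDENT})$")
NOT = re.compile(r"^NOT\s+")


def parse_literal(text):
    """Parse 'NOT* slot *= value' into (negated, slot, value)."""
    text = text.strip()
    negated = False
    while NOT.match(text):
        negated = not negated          # nested NOTs cancel pairwise
        text = NOT.sub("", text, count=1).strip()
    m = PAIR.match(text)
    if not m:
        raise ValueError(f"bad literal: {text!r}")
    return (negated, m.group(1), m.group(2))


def parse_rule(text):
    """Split a rule string into trigger, condition literals and assignments."""
    m = RULE.match(text)
    if not m:
        raise ValueError("expected: WHEN <rtrig> IF <rcond> THEN <rassign>")
    trig, cond, assign = (part.strip() for part in m.groups())

    t = PAIR.match(trig)
    if not t:
        raise ValueError(f"bad trigger: {trig!r}")

    parts = re.split(r"\s+AND\s+", cond)
    if parts[0] == "TRUE":             # TRUE may only start a condition
        parts = parts[1:]
    literals = [parse_literal(p) for p in parts]

    if assign.startswith("{") and assign.endswith("}"):
        items = [a.strip() for a in assign[1:-1].split(",")]
    else:
        items = [assign]
    assignments = []
    for item in items:
        a = ASSIGN.match(item)
        if not a:
            raise ValueError(f"bad assignment: {item!r}")
        assignments.append((a.group(1), a.group(2)))

    return {"trigger": (t.group(1), t.group(2)),
            "condition": literals,
            "assignments": assignments}


rule = parse_rule(
    "WHEN door *= open IF NOT alarm *= on AND light *= off "
    "THEN { light := on, log := entry }"
)
```

The slot and value names in the sample rule (door, alarm, light, log) are
invented for illustration only.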

Abstract. We present a proof system for verifying CCS processes in the
modal μ-calculus. Its novelty lies in the generality of the proof judgements,
allowing parametric and compositional reasoning in this complex setting.
This is achieved, in part, by the use of explicit fixed point ordinal
approximations, and in part by a complete separation, following an approach
by Simpson, of rules concerning the logic from the rules encoding
the operational semantics of the process language.



1 Introduction 

In a number of recent papers [1, 2, 3, 4, 9], proof-theoretical frameworks for
compositional verification have been put forward based on Gentzen-style sequents
of the shape $\Gamma \vdash \Delta$, where the components of $\Gamma$ and $\Delta$ are
correctness assertions $P : \phi$. Several programming or modelling languages have
been considered, including CCS [3], the π-calculus [2], CHOCS [1], general
GSOS-definable languages [9], and even a significant core fragment of a real
programming language, Erlang [4]. An important precursor to the above papers is
[10], which used ternary sequents to build compositional proof systems for CCS
and SCCS vs. Hennessy-Milner logic [6].

A key idea is that the use of a general sequent format allows correctness
properties $P : \phi$ to be stated and proved in a parametric fashion. That is,
correctness statements of a composite program $P(Q_1, Q_2)$, say, can be
relativized to correctness statements of the components $Q_1, Q_2$. A general
rule of subterm cut

$$\frac{\Gamma \vdash Q : \psi, \Delta \qquad \Gamma, x : \psi \vdash P : \phi, \Delta}{\Gamma \vdash P[Q/x] : \phi, \Delta} \tag{1}$$

allows such subterm assumptions to be introduced and used for compositional
verification.

It is, however, difficult to support temporal properties within such a framework.
As is well known [12], logics like LTL, CTL, or CTL* are poorly equipped
for compositional reasoning without resort to devices like history or prophecy
variables. For this reason, our investigations have tended to focus on logics
based, in some form, on the modal μ-calculus, in which the recursive properties
needed for property decomposition can be expressed more adequately. In [3] the
first author showed one way of realizing a proof system using the subterm cut
rule, and built, for the first time, a compositional proof system capable of
handling general CCS terms, including those that create new processes
dynamically. In [4] we used a similar, though considerably improved, approach to
address Erlang. The approach of [3] suffered from two main shortcomings, however:

1. Though systematic, the embedding of the CCS operational semantics into 
the proof system was indirect, and allowed only rather weak completeness 
results to be obtained. 

2. The handling of recursive formulas was very syntactic and hedged by com- 
plicated side conditions, hiding the essence of our proof-theoretical approach 
from view. 

In this paper both these issues are addressed. First, following an idea by
Simpson [9], we fully separate the embedding of the transitional semantics for P
from the general handling of the logic by employing process variables and
transition assertions of the shape $P \xrightarrow{\alpha} Q$. These assertions
provide a semantically explicit bridge between the transitions of P and the
one-step modalities of the logic. A similar approach is used to handle the second
complication. The essential difficulty is that, to be sound, rates of progress
for fixed point formulas appearing in different places in a sequent must be
related. To achieve this in a simple and semantically explicit way we employ
fixed point approximations using ordinal variables, and ordinal constraints of
the shape $\kappa_1 < \kappa_2$.

In the paper we first introduce the modal μ-calculus with explicit ordinal
approximations, and we introduce the basic form of judgment used in the proof
system. In the absence of process structure such as CCS, models are just standard
Kripke models. Correspondingly, the proof system in this case can be seen to
provide an account of Gentzen-style logical entailment. The novelty, in this case,
lies in the use of ordinal approximations. This fragment of the proof system
is introduced in Section 3. The key ingredient to release the power of this proof
system is a rule of discharge, or termination, which recognizes proofs by
well-founded induction. In another paper [5] we introduce a game which embodies
such a rule, and show completeness of the resulting proof system by reduction to
Kozen's well-known axiomatization [7]. A practical rule of discharge, however,
must be local, which the game condition of [5] is not. Here, instead, we introduce
a local version of the discharge rule which is, we believe, a simple and intuitive
approximation of the complete global condition. This local discharge rule is
introduced (summarily, in this abstract) in Section 4. A full instantiation of our
approach to CCS requires in addition an embedding of the CCS operational
semantics into the present Gentzen-style format (following Simpson [9]) plus the
subterm cut rule (1). This extension is shown in Section 5, and then in Section
6 we give a rough sketch of a correctness proof of a simple infinite state CCS
process.

2 Logic

Formulas $\phi$ are generated by the following grammar, where $\kappa$ ranges over a
set of ordinal variables, $\alpha$ over a set of actions, and X over a set of
propositional variables.

$$\phi ::= \phi \vee \phi \mid \neg\phi \mid \langle\alpha\rangle\phi \mid X \mid \mu X.\phi \mid (\mu X.\phi)^\kappa$$

An occurrence of a subformula $\psi$ in $\phi$ is positive if $\psi$ appears in the
scope of an even number of negation symbols. Otherwise the occurrence is negative.
The formation of least fixed point formulas of one of the shapes $\mu X.\phi$ or
$(\mu X.\phi)^\kappa$ is subject to the usual formal monotonicity condition that
occurrences of X in $\phi$ are positive. We use the symbols U and V to range over
(unindexed) fixed point formulas $\mu X.\phi$. A formula $\phi$ is propositionally
closed if $\phi$ does not have free occurrences of propositional variables.
Standard abbreviations apply:

$$\begin{aligned}
false &= \mu X.X \\
true &= \neg false \\
\phi \wedge \psi &= \neg(\neg\phi \vee \neg\psi) \\
[\alpha]\phi &= \neg\langle\alpha\rangle\neg\phi \\
\nu X.\phi &= \neg\mu X.\neg(\phi[\neg X/X])
\end{aligned}$$

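We assume the standard modal μ-calculus semantics [7]:

$$\begin{aligned}
\|\phi \vee \psi\|\rho &= \|\phi\|\rho \cup \|\psi\|\rho \\
\|\neg\phi\|\rho &= S \setminus \|\phi\|\rho \\
\|\langle\alpha\rangle\phi\|\rho &= \{ P \mid \exists Q \in \|\phi\|\rho.\ P \xrightarrow{\alpha} Q \} \\
\|X\|\rho &= \rho(X) \\
\|\mu X.\phi\|\rho &= \bigcap \{ S' \subseteq S \mid \|\phi\|\rho[S'/X] \subseteq S' \}
\end{aligned}$$

augmented by the clause:

$$\|(\mu X.\phi)^\kappa\|\rho = \begin{cases}
\emptyset & \text{if } \rho(\kappa) = 0 \\
\|\phi\|\rho[\|(\mu X.\phi)^\kappa\|\rho / X,\ \beta/\kappa] & \text{if } \rho(\kappa) = \beta + 1 \\
\bigcup \{ \|(\mu X.\phi)^\kappa\|\rho[\beta/\kappa] \mid \beta < \rho(\kappa) \} & \text{if } \rho(\kappa) \text{ is a limit ordinal}
\end{cases}$$

where $\rho$ is an interpretation function (environment), mapping ordinal variables
to ordinals, and propositional variables to sets of closed process terms, or states,
from a domain S ranged over by P.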

The use of ordinal approximation hinges on the following results (of which 
(1) is the well-known Knaster-Tarski fixed point theorem). 

Theorem 1.

1. $\|\mu X.\phi\|\rho = \bigcup_{\beta} \|(\mu X.\phi)^\kappa\|\rho[\beta/\kappa]$

2. $\|(\mu X.\phi)^\kappa\|\rho = \bigcup_{\beta < \rho(\kappa)} \|\phi\|\rho[\|(\mu X.\phi)^\kappa\|\rho / X,\ \beta/\kappa]$

Observe how this casts the properties U and $U^\kappa$ as existential properties.
This is useful to motivate the proof rules for fixed point formulas given below.
Observe also that, for countable models, quantification over countable ordinals
in Theorem 1 suffices. In the definition below, we extend interpretation functions
$\rho$ to map process variables x to closed process terms (states).
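
For a finite model the ordinal approximants are just the finite stages of a
fixed point iteration. The sketch below illustrates this on a small, invented
transition system (the states, transitions and the formula are assumptions chosen
for the example, not taken from the paper). It computes the approximants of
$\mu X.(\langle b \rangle true \vee \langle a \rangle X)$ starting from the empty set and checks
that their union is the least fixed point.

```python
def diamond(trans, action, target):
    """States with an `action`-transition into `target` (the <action> modality)."""
    return {s for (s, a, t) in trans if a == action and t in target}


def approximants(step):
    """Chain of approximants of the least fixed point of a monotone `step`.

    chain[k] plays the role of the approximant indexed by the ordinal k;
    on a finite model the chain stabilises after finitely many steps.
    """
    chain = [frozenset()]
    while True:
        nxt = frozenset(step(chain[-1]))
        if nxt == chain[-1]:
            return chain
        chain.append(nxt)


# Hypothetical four-state system: 0 -a-> 1 -a-> 2 -b-> 3
states = {0, 1, 2, 3}
trans = {(0, "a", 1), (1, "a", 2), (2, "b", 3)}


def step(X):
    # one unfolding of  mu X. (<b>true or <a>X)
    return diamond(trans, "b", states) | diamond(trans, "a", X)


chain = approximants(step)
lfp = set().union(*chain)
```

Each state enters the chain at the stage that counts how many unfoldings are
needed to witness the property, which is the existential reading of fixed point
formulas noted above.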

Definition 1 (Assertions, Judgements).

1. An assertion is an expression of one of the forms $E : \phi$, $\kappa < \kappa'$,
or $E \xrightarrow{\alpha} F$, where E, F are process terms and $\phi$ is a
propositionally closed formula.

2. The assertion $E : \phi$ is valid for an interpretation function $\rho$ (written
$E \models_\rho \phi$) if $E\rho \in \|\phi\|\rho$. The assertion $\kappa < \kappa'$ is
valid for $\rho$ if $\rho(\kappa) < \rho(\kappa')$. The assertion
$E \xrightarrow{\alpha} F$ is valid for $\rho$ if $E\rho \xrightarrow{\alpha} F\rho$ is a
valid transition.

3. A sequent is an expression of the form $\Gamma \vdash \Delta$, where $\Gamma$ and
$\Delta$ are sets of assertions.

4. The sequent $\Gamma \vdash \Delta$ is valid (written $\Gamma \models \Delta$) if, for all
interpretation functions $\rho$, all assertions in $\Gamma$ are valid for $\rho$ only
if some assertion in $\Delta$ is valid for $\rho$ as well.

An assertion of the shape $E : \phi$ is called a property assertion, an assertion
of the shape $\kappa < \kappa'$ is called an ordinal constraint, and an assertion of
the shape $E \xrightarrow{\alpha} F$ is called a transition assertion.



3 Proof System: Logical Entailment 

We first consider the problem of logical entailment. In this case, process terms
E in assertions of the shape $E : \phi$ are variables.



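Structural Rules. We assume the axiom rule, the rule of cut, and weakening:

$$\text{Ax}\ \frac{}{\Gamma, A \vdash A, \Delta} \qquad \text{Cut}\ \frac{\Gamma \vdash A, \Delta \qquad \Gamma, A \vdash \Delta}{\Gamma \vdash \Delta}$$

$$\text{W-L}\ \frac{\Gamma \vdash \Delta}{\Gamma, A \vdash \Delta} \qquad \text{W-R}\ \frac{\Gamma \vdash \Delta}{\Gamma \vdash A, \Delta}$$

As in [9], in the axiom rule the assertion A need only be instantiated to
transition assertions, and then $\Delta$ can be assumed to be empty. Since $\Gamma$
and $\Delta$ are sets, structural rules like permutation and contraction are
vacuous. We conjecture that both cut and the weakening rules are admissible.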
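
Logical Rules. In the following listing we assume that $U = \mu X.\phi$.

$$\neg\text{-L}\ \frac{\Gamma \vdash E : \phi, \Delta}{\Gamma, E : \neg\phi \vdash \Delta} \qquad \neg\text{-R}\ \frac{\Gamma, E : \phi \vdash \Delta}{\Gamma \vdash E : \neg\phi, \Delta}$$

$$\vee\text{-L}\ \frac{\Gamma, E : \phi \vdash \Delta \qquad \Gamma, E : \psi \vdash \Delta}{\Gamma, E : \phi \vee \psi \vdash \Delta} \qquad \vee\text{-R}\ \frac{\Gamma \vdash E : \phi, E : \psi, \Delta}{\Gamma \vdash E : \phi \vee \psi, \Delta}$$

$$\langle\alpha\rangle\text{-L}\ \frac{\Gamma, E \xrightarrow{\alpha} x, x : \phi \vdash \Delta}{\Gamma, E : \langle\alpha\rangle\phi \vdash \Delta}\ fresh(x) \qquad \langle\alpha\rangle\text{-R}\ \frac{\Gamma \vdash E \xrightarrow{\alpha} E', \Delta \qquad \Gamma \vdash E' : \phi, \Delta}{\Gamma \vdash E : \langle\alpha\rangle\phi, \Delta}$$

$$U\text{-L}\ \frac{\Gamma, E : U^\kappa \vdash \Delta}{\Gamma, E : U \vdash \Delta}\ fresh(\kappa) \qquad U\text{-R}\ \frac{\Gamma \vdash E : \phi[U/X], \Delta}{\Gamma \vdash E : U, \Delta}$$

$$U^\kappa\text{-L}\ \frac{\Gamma, \kappa' < \kappa, E : \phi[U^{\kappa'}/X] \vdash \Delta}{\Gamma, E : U^\kappa \vdash \Delta}\ fresh(\kappa') \qquad U^\kappa\text{-R}\ \frac{\Gamma \vdash \kappa' < \kappa, \Delta \qquad \Gamma \vdash E : \phi[U^{\kappa'}/X], \Delta}{\Gamma \vdash E : U^\kappa, \Delta}$$

The side condition fresh(x) (fresh($\kappa$)) is intended to mean that x ($\kappa$)
does not appear freely in the conclusion of the rule.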

The rules for unindexed and indexed fixed point formulas are directly motivated
by Theorem 1. The lack of symmetry between rules U-L and U-R is not
accidental; their symmetric counterparts are in fact admissible.



Ordinal Constraints. Finally, we need to provide rules for reasoning about
ordinal constraints.

$$\text{OrdTr}\ \frac{\Gamma, \kappa' < \kappa \vdash \kappa'' < \kappa', \Delta}{\Gamma, \kappa' < \kappa \vdash \kappa'' < \kappa, \Delta}$$

This rule is sufficient provided ordinal variables and constraints are only
introduced during the proof, i.e., do not appear in the root sequent.

Theorem 2 (Local Soundness). All rules for logical entailment are individually
sound: each rule's conclusion is valid whenever its premises are valid.



4 Proof System: Rule of Discharge 

Processes and formulas can be recursive, allowing proof trees to grow
unboundedly. Intuitively, one would like to terminate an open branch whenever
a “repeating” sequent is reached, i.e. a sequent which is an instance, up to
some substitution $\sigma$, of one of its ancestors, its “companion”, in the proof
tree. A proof structure, all leaf nodes of which are either axioms or such
repeating nodes, serves as the basis for well-founded ordinal induction
arguments. A global discharge condition is a sufficient condition for such an
argument to be valid. If a global discharge condition applies, all leaves which
are not axioms can be considered induction hypothesis instances in some, possibly
deeply nested, proof by well-founded induction.

The use of ordinal variables and constraints allows global discharge conditions
to be phrased in a clear and semantically transparent way. The most general view
of discharge is presented in game-based terms elsewhere [5]. In essence, global
discharge guarantees well-foundedness of proofs: along every infinite path
in the infinitely unfolded proof tree, ordinal constraints grow downwards in an
unbounded manner.

Here we present a discharge condition which is, in contrast to the global 
condition of [5], more local, and easier to understand and apply. Moreover, even 
though it is in general incomplete, it is, in our experience, adequate in a great 
many situations. In particular it is powerful enough to handle the example con- 
sidered below. 

First a single piece of terminology: Two repeat nodes are called related if they 
are in the same strongly connected component in the directed graph obtained 
from the proof structure by identifying the repeat nodes with their companions. 

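Definition 2 (Rule of Discharge). A node labelled $\Gamma \vdash \Delta$ can be
discharged with $U^\kappa$ and substitution $\sigma$ against an ancestor node labelled
$\Gamma' \vdash \Delta'$ if:

(i) $U^\kappa$ occurs as subformula in $\Gamma'$ or $\Delta'$;

(ii) $\phi\sigma \in \Gamma$ whenever $\phi \in \Gamma'$, and $\phi\sigma \in \Delta$ whenever
$\phi \in \Delta'$;

(iii) $\Gamma \vdash \kappa\sigma < \kappa$ is derivable;

(iv) assuming the related discharge nodes labelled
$\Gamma_1 \vdash \Delta_1, \ldots, \Gamma_n \vdash \Delta_n$ have been discharged with
$U_1^{\kappa_1}, \ldots, U_n^{\kappa_n}$ and $\sigma_1, \ldots, \sigma_n$ against
$\Gamma'_1 \vdash \Delta'_1, \ldots, \Gamma'_n \vdash \Delta'_n$, there is a linear ordering
$\prec$ on these discharge nodes including the present node, such that whenever
$i \prec j$: (a) $U_i^{\kappa_i}$ occurs as subformula in $\Gamma'_j$ or $\Delta'_j$, and
(b) either $\kappa_i\sigma_j = \kappa_i$, or $\Gamma_j \vdash \kappa_i\sigma_j < \kappa_i$ is
derivable.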

In clause (iv), the linear ordering can be chosen differently each time the rule is 
applied (and a new node is added to the corresponding class of related discharge 
nodes). The purpose of the clause is to guarantee that along every infinite path 
in the infinitely unfolded proof tree, ordinal constraints grow downwards in an 
unbounded manner. 

Theorem 3 (Soundness). The proof system including the rules for logical
entailment and the rule of discharge is sound: all sequents derivable in the proof
system are valid.

For finite state labelled transition systems the above proof system reduces to 
an ordinary model checker like the one presented in [11], and is hence complete 
for such systems. In general, however, due to undecidability of the model checking 
problem addressed here, the system is necessarily incomplete. 

5 Proof System: Operational Semantics

Having transition assertions allows the transitional semantics of a process
language to be embedded directly into the proof system as a separate set of proof
rules. This can be done in a straightforward manner for any GSOS-definable
language [9]. Here we illustrate this approach on a well-known process language,
Milner's Calculus of Communicating Systems [8].

We assume that CCS process terms E are generated by the following grammar,
where l ranges over a given set of labels, L over subsets of this set of labels,
$\alpha$ over actions of the shape $\tau$, $l$ or $\bar{l}$, and x over a set of process
variables.

$$E ::= 0 \mid \alpha.E \mid E + E \mid E \,|\, E \mid E \backslash L \mid x \mid fix\ x.E$$

The set of states S used in Section 2 is the set of all closed process terms. The 
operational semantics of CCS is given as a closure relation on processes through 
a set of transition rules [8]: the transitions that a CCS process can perform 
are exactly those derivable by these rules. Hence, the transition rules can be 
included directly as right introduction rules into our proof system, while the 
left introduction rules (stating what transitions are not possible), come from the 
closure assumption. 
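
The closure reading can be made concrete by computing, for a closed term, exactly
the set of transitions derivable from the standard CCS rules. The sketch below
does this under assumptions of its own: terms are encoded as nested tuples,
co-labels are written with a leading `~`, `tau` is the silent action, restriction
sets are frozensets of plain labels, and recursion is assumed to be guarded
(otherwise the unfolding would not terminate). None of these encodings come from
the paper.

```python
def co(action):
    """Complement of a label: l <-> ~l (an encoding chosen for this sketch)."""
    return action[1:] if action.startswith("~") else "~" + action


def subst(term, var, repl):
    """Replace free occurrences of `var` in `term` by the closed term `repl`."""
    tag = term[0]
    if tag == "nil":
        return term
    if tag == "var":
        return repl if term[1] == var else term
    if tag == "pre":
        return ("pre", term[1], subst(term[2], var, repl))
    if tag in ("sum", "par"):
        return (tag, subst(term[1], var, repl), subst(term[2], var, repl))
    if tag == "res":
        return ("res", subst(term[1], var, repl), term[2])
    if tag == "fix":
        if term[1] == var:            # `var` is rebound, nothing to replace
            return term
        return ("fix", term[1], subst(term[2], var, repl))
    raise ValueError(tag)


def steps(term):
    """All pairs (action, successor) derivable for a closed, guarded term."""
    tag = term[0]
    if tag in ("nil", "var"):
        return set()
    if tag == "pre":                  # a.E --a--> E
        return {(term[1], term[2])}
    if tag == "sum":                  # either summand may move
        return steps(term[1]) | steps(term[2])
    if tag == "par":
        e, f = term[1], term[2]
        left, right = steps(e), steps(f)
        out = {(a, ("par", e1, f)) for a, e1 in left}
        out |= {(a, ("par", e, f1)) for a, f1 in right}
        # synchronisation of complementary labels yields tau
        out |= {("tau", ("par", e1, f1))
                for a, e1 in left for b, f1 in right
                if a != "tau" and b == co(a)}
        return out
    if tag == "res":                  # E\L blocks labels in L and their complements
        hidden = term[2]
        return {(a, ("res", e1, hidden)) for a, e1 in steps(term[1])
                if a == "tau" or (a not in hidden and co(a) not in hidden)}
    if tag == "fix":                  # unfold the recursion once
        return steps(subst(term[2], term[1], term))
    raise ValueError(tag)


NIL = ("nil",)
sender = ("pre", "l", NIL)
receiver = ("pre", "~l", NIL)
pair = ("par", sender, receiver)
loop = ("fix", "x", ("pre", "a", ("var", "x")))
```

A transition that is absent from `steps(term)` is one that no rule can derive,
which is the information the left introduction rules exploit.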

We present only the most significant of the resulting rules, and in particular 
the ones used in the Example to follow. 



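$$\text{0-L}\ \frac{}{\Gamma,\ 0 \xrightarrow{\alpha} x \vdash \Delta} \qquad \alpha\text{-R}\ \frac{}{\Gamma \vdash \alpha.E \xrightarrow{\alpha} E,\ \Delta}$$

$$\alpha\text{-L-1}\ \frac{\Gamma[E/x] \vdash \Delta[E/x]}{\Gamma,\ \alpha.E \xrightarrow{\alpha} x \vdash \Delta} \qquad \alpha\text{-L-2}\ \frac{}{\Gamma,\ \alpha.E \xrightarrow{\beta} x \vdash \Delta}\ \alpha \neq \beta$$

$$\text{+-L}\ \frac{\Gamma[y/x],\ E \xrightarrow{\alpha} y \vdash \Delta[y/x] \qquad \Gamma[z/x],\ F \xrightarrow{\alpha} z \vdash \Delta[z/x]}{\Gamma,\ E + F \xrightarrow{\alpha} x \vdash \Delta} \qquad \text{+-R}\ \frac{\Gamma \vdash E \xrightarrow{\alpha} E',\ \Delta}{\Gamma \vdash E + F \xrightarrow{\alpha} E',\ \Delta}$$

$$|\text{-R-1}\ \frac{\Gamma \vdash E \xrightarrow{\alpha} E',\ \Delta}{\Gamma \vdash E \,|\, F \xrightarrow{\alpha} E' \,|\, F,\ \Delta} \qquad |\text{-R-2}\ \frac{\Gamma \vdash E \xrightarrow{l} E',\ \Delta \qquad \Gamma \vdash F \xrightarrow{\bar{l}} F',\ \Delta}{\Gamma \vdash E \,|\, F \xrightarrow{\tau} E' \,|\, F',\ \Delta}$$

$$|\text{-L-1}\ \frac{\Gamma[y|F/x],\ E \xrightarrow{\alpha} y \vdash \Delta[y|F/x] \qquad \Gamma[E|z/x],\ F \xrightarrow{\alpha} z \vdash \Delta[E|z/x]}{\Gamma,\ E \,|\, F \xrightarrow{\alpha} x \vdash \Delta}$$

$$|\text{-L-2}\ \frac{\begin{array}{c}
\Gamma[y_1|F/x],\ E \xrightarrow{\tau} y_1 \vdash \Delta[y_1|F/x] \\
\Gamma[E|y_2/x],\ F \xrightarrow{\tau} y_2 \vdash \Delta[E|y_2/x] \\
\Gamma[z_1|z_2/x],\ l_1 = \bar{l_2},\ E \xrightarrow{l_1} z_1,\ F \xrightarrow{l_2} z_2 \vdash \Delta[z_1|z_2/x]
\end{array}}{\Gamma,\ E \,|\, F \xrightarrow{\tau} x \vdash \Delta}$$

$$\text{fix-L}\ \frac{\Gamma,\ E[fix\ x.E/x] \xrightarrow{\alpha} y \vdash \Delta}{\Gamma,\ fix\ x.E \xrightarrow{\alpha} y \vdash \Delta} \qquad \text{fix-R}\ \frac{\Gamma \vdash E[fix\ x.E/x] \xrightarrow{\alpha} E',\ \Delta}{\Gamma \vdash fix\ x.E \xrightarrow{\alpha} E',\ \Delta}$$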

In addition to these rules, a subterm cut rule is needed to allow for parametric
and compositional reasoning:

$$\text{SubtermCut-R}\ \frac{\Gamma \vdash F : \psi, \Delta \qquad \Gamma, x : \psi \vdash E : \phi, \Delta}{\Gamma \vdash E[F/x] : \phi, \Delta}\ fresh(x)$$



6 Example 

Consider a process

$$Counter = fix\ x.\ up.(x \,|\, down.x)$$

which can alternately engage in up and down actions, generating a new copy
of itself after each up action. Clearly, at any point in time, regardless of how
many counters have already been spawned, this system can engage only in finite
sequences of consecutive down actions. This property can be formalised as the
negation of the following formula:

$$\phi = \mu X.\ \neg \mu Y.\ \neg(\langle up \rangle X \vee \langle down \rangle \neg Y)$$

So, we want to prove validity of the sequent

$$\vdash Counter : \neg\phi$$

We perform the proof backwards, from this goal sequent towards the axioms,
guided by the shape of the formulas and process terms involved. After eliminating
the negation and approximating $\phi$ one obtains

$$Counter : \phi^\kappa \vdash \tag{2}$$

Continuing in the same straightforward manner we soon arrive at the following
two sequents:

$$\kappa' < \kappa,\ up.(Counter \,|\, down.Counter) \xrightarrow{down} x \vdash x : \psi$$

$$\kappa' < \kappa,\ Counter \,|\, down.Counter : \phi^{\kappa'} \vdash$$

the first of which is an axiom. The second sequent is similar to sequent (2), with
the important difference that a new down.Counter component has appeared.
This is the point where one would like to perform an inductive argument on the
system structure, and this can be done using subterm cut. The most important
question is which property of the component being cut yields the overall
system property being verified. A convenient case is when it is the same property,
i.e., when the property being verified composes nicely. This is the case in our
example, partly because there is no communication between the components.
So, after two applications of subterm cut we obtain the following three sequents:

$$\kappa' < \kappa,\ Counter : \phi^{\kappa'} \vdash$$

$$\kappa' < \kappa,\ down.Counter : \phi^{\kappa'} \vdash$$

$$\kappa' < \kappa,\ x \,|\, y : \phi^{\kappa'} \vdash x : \phi^{\kappa'},\ y : \phi^{\kappa'}$$

the first of which can be discharged with $\phi^\kappa$ and substitution
$[\kappa \mapsto \kappa']$ against (2). Notice that this node has no related discharge
nodes (so far), so only clauses (i) to (iii) of the Rule of Discharge have to be
checked in this case. The second sequent is easily reduced to an axiom and a
discharge node. Handling the remaining sequent is only slightly more involved.

7 Conclusion 

We presented a proof system for verifying CCS processes in the modal μ-calculus.
Its novelty lies in the generality of the proof judgements, allowing parametric and
compositional reasoning in the complex setting of this powerful logic. This is
achieved, in part, by the use of explicit fixed point ordinal approximations, and in
part by a complete separation, following Simpson [9], in the proof system of the
rules concerning the logic from the rules encoding the operational semantics of
the process language (here CCS). This makes the proof system easily adaptable
to other languages with a clean transitional semantics.

Abstract. We review the use of categories with products as a vehicle
for the construction of bit-level functions that correspond to combinational
circuits. Further, we show that results from category theory
concerning list homomorphisms can help in our search for a computational
model that captures the desirable properties of digital circuits,
namely locality of communications and simple, repetitive structure of
computational components. We demonstrate applications of the theory
to some simple problems.



1 Introduction 

New reconfigurable computing technology redefines the traditional
hardware/software boundary and enables the rapid realization of algorithm-specific
hardware architectures on a low-cost base, such as Field Programmable Gate Arrays
(FPGA) [3]. We want to develop a rigorous methodology for creating a range of
application-specific high-level languages that can be compiled directly into
FPGAs. Two crucial issues should be captured by our approach, namely
hardware-independent software development and efficient compilation of programs
into digital circuits. Moreover, since applications that profit from
reconfigurable computing usually have a high degree of parallelism and/or
concurrency, and involve additional design decisions regarding decomposition,
communication, routing, timing, etc., a rigorous methodology is needed for the
development of such programs. This suggests that a good level of abstraction must
be mathematically based, so that reasoning and formal development are possible.

The categorical data type (CDT) approach [8] is an extension of the abstract
data type in a way that appears to be particularly useful for parallel
computations [10, 5]. CDTs have operations, equations relating them, and a
guarantee that all of the required operations and equations are present. A theory
of CDTs is a theory of algebraic structures that behave like the constructed data
type, and of homomorphisms among them. The important property of homomorphic
operations is that the pattern of computations follows the structure of the
argument. Thus, for homomorphic functions, locality of communication, regularity
and partitionability of computation are inherent properties [5, 6].

Many polymorphic higher-order functions are homomorphisms or almost
homomorphisms. Consider functionals like map, which applies some function to all
individual elements of a data aggregate independently, producing a data
aggregate with new values of individual elements, or reduce, which calculates a
“cumulative sum” of all elements in a data aggregate. These functionals can be
defined over different data types, such as cons [2] and conc [7] lists,
homogeneous binary trees [7], arrays [1], etc. While data types vary from
application to application (e.g., arrays are appropriate for image processing,
trees for divide and conquer algorithms, streams for signal processing, etc.),
the general patterns of computation on these types are the same.

The main idea is that for any given application a set of appropriate basic 
data types is defined in terms of categories with products, and operations on 
these types are compiled directly into blocks of FPGA logic cells. Then higher- 
order functions, or operations for application specific data aggregates are defined 
within the scope of a CDT and implemented as templates for composition of basic 
primitives. Having been carefully chosen so as to satisfy constraints of the FPGA 
technology, these operations ensure an efficient implementation of an application 
on FPGA-based hardware. In other words, an ability to express an application in 
terms of a composition of the set of predefined higher-order functions on specific 
data aggregates is a test-bed for an efficient implementation of the application. 

2 Categories with Products 

To describe and analyse combinational circuits (i.e., circuits without latches and
feedback) we use a category with products. Let $B = \{0, 1\}$. Consider the
category Circ [8] with objects $B^0, B^1, B^2, \ldots$ and arrows that represent all
functions between these sets. Notice that $B^0 = \{*\}$, the unit object, $B^1 = B$,
and $B^n = \{(x_1, x_2, \ldots, x_n) : x_i \in B\}$ for $n > 1$. In a category, each
morphism has a designated domain and codomain among the objects; any object A has
an identity morphism

$$1_A : A \to A,$$

and for any given morphisms $f : A \to B$, $g : B \to C$, $h : C \to D$, there is
a designated composition of morphisms which satisfies the identity and associative
laws:

$$g \circ f : A \to C; \quad 1_B \circ f = f = f \circ 1_A, \quad h \circ (g \circ f) = (h \circ g) \circ f : A \to D.$$

The product of $B^m$ and $B^n$ is $B^{m+n}$ with the following projections:

$$p_1 : B^{m+n} \to B^m, \quad p_2 : B^{m+n} \to B^n$$

such that $(x_1, x_2, \ldots, x_m, \ldots, x_{m+n}) \mapsto (x_1, x_2, \ldots, x_m)$ and
$(x_1, x_2, \ldots, x_m, \ldots, x_{m+n}) \mapsto (x_{m+1}, \ldots, x_{m+n})$. The two
functions from $B^0$ to $B^1$ are the constants true and false. Some further
interesting functions in this category are: negation, $\neg : B \to B$; logical and,
$\& : B^2 \to B$; logical or, $OR : B^2 \to B$; and exclusive or,
$XOR : B^2 \to B$. Two useful unary functions $=_1$ and $=_0$ check the equality of
the argument to 1 and 0, respectively. It is easy to see that $(=_1) = 1_B$ and
$(=_0) = (\neg)$.

— In a category with products, we can define a parallel composition of two
functions. Given $f : X_1 \to Y_1$ and $g : X_2 \to Y_2$, a parallel composition
is a function $f \times g : X_1 \times X_2 \to Y_1 \times Y_2$ which maps $(x_1, x_2)$
into the pair $(f(x_1), g(x_2))$. This function obeys the laws of projection:

$$p_{Y_1} \circ (f \times g) = f \circ p_{X_1}, \quad p_{Y_2} \circ (f \times g) = g \circ p_{X_2}.$$

— A diagonal function $\Delta$ which produces two copies of its input can also be
defined in a category with products: $\Delta_X : X \to X \times X$, such that
$x \mapsto (x, x)$.

$$p_1 \circ \Delta_X = 1_X, \quad p_2 \circ \Delta_X = 1_X.$$

— The function $twist : X \times Y \to Y \times X$ interchanges its two inputs:
$(x, y) \mapsto (y, x)$. If $p_1, p_2$ and $q_1, q_2$ are the projections of
$X \times Y$ and $Y \times X$ respectively, then

$$q_1 \circ twist = p_2, \quad q_2 \circ twist = p_1.$$

It is known [8] that in the category Circ one can construct any logical function
starting with true, false, not, and, or, identity maps, and projections, using only
compositions and the property of products. Moreover, any such function can
be implemented using a circuit without latches, consisting of wires and gates.
Indeed, the set B is the set of possible states of each wire. The functions &, OR
and XOR are implemented directly as gates, as shown in Fig. 1. Identity map(s)
$1_{B^n}$, $n > 1$, and $=_1$ correspond to a (group of adjacent) wire(s); $\neg$ and
$=_0$ to an inverter.








Fig. 1. Implementation of functions in category Circ 



We can do a number of things with wires and components that correspond to
constructions of new functions in the category Circ:

— Splitting wires corresponds to the diagonal function, $\Delta_B : B \to B^2$.

— We can twist two wires. This corresponds to the function $twist : B^2 \to B^2$.

— We can put two components side by side. This corresponds to the parallel
composition $f \times g : B^n \times B^m \to B^k \times B^l$.

— We can put two components in series, connecting the output wires of one
component with the input wires of another. This corresponds to the composition
$g \circ f : B^n \to B^m$.

Example. Consider a 1-to-2 decoder function which is defined in the category Circ
as follows: $d : B^2 \to B^2$, where $(x, s) \mapsto (y_1, y_0)$ such that
$y_1 = x \,\&\, (=_1 s)$ and $y_0 = x \,\&\, (=_0 s)$. A decomposition of the
function d is straightforward: first, copy both inputs x and s, pair them by
twisting the “middle” elements, apply the functions $=_1$ and $=_0$ to each copy
of s, and finally compute & of each resulting pair:

$$d_{1\,to\,2} : B^2 \xrightarrow{\Delta_B \times \Delta_B} B^4 \xrightarrow{1_B \times twist \times 1_B} B^4 \xrightarrow{1_B \times (=_1) \times 1_B \times (=_0)} B^4 \xrightarrow{\& \times \&} B^2$$
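
The four stages of this decomposition can be replayed in software. The sketch
below is only an illustration: it models bits as the integers 0 and 1 and wires as
tuples, and the helper names (`seq`, `par`, `diag`, `twist`, `and_gate`, `eq1`,
`eq0`) are assumptions of the sketch rather than notation from the paper.

```python
def seq(*stages):
    """Serial composition: feed the output wires of each stage into the next."""
    def run(bits):
        for stage in stages:
            bits = stage(bits)
        return bits
    return run


def par(*parts):
    """Parallel composition; each part is (number_of_input_wires, function)."""
    def run(bits):
        out, i = (), 0
        for width, f in parts:
            out += f(bits[i:i + width])
            i += width
        assert i == len(bits), "wire count mismatch"
        return out
    return run


def ident(b):        # a plain wire (identity map)
    return b


def diag(b):         # split a wire: x -> (x, x)
    return b + b


def twist(b):        # cross two wires: (x, y) -> (y, x)
    return (b[1], b[0])


def and_gate(b):     # & : B^2 -> B
    return (b[0] & b[1],)


eq1 = ident          # (=1) is the identity on a bit


def eq0(b):          # (=0) is negation, i.e. an inverter
    return (1 - b[0],)


decoder_1to2 = seq(
    par((1, diag), (1, diag)),                         # (x, s) -> (x, x, s, s)
    par((1, ident), (2, twist), (1, ident)),           # -> (x, s, x, s)
    par((1, ident), (1, eq1), (1, ident), (1, eq0)),   # -> (x, =1 s, x, =0 s)
    par((2, and_gate), (2, and_gate)),                 # -> (y1, y0)
)
```

Calling `decoder_1to2((x, s))` returns the pair `(y1, y0)`, so the data input x
is routed to y1 when s is 1 and to y0 when s is 0.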

A circuit which implements a decoder function is drawn in Fig. 2. Notice how the
circuit corresponds exactly to the decomposition given above.

Fig. 2. A 1-to-2 decoder

Suppose now we want to design a 2-to-4 decoder $d_{2\,to\,4} : B^3 \to B^4$, where
$(x, s_1, s_0) \mapsto (y_3, y_2, y_1, y_0)$, such that
$y_0 = x \,\&\, (=_0 s_1) \,\&\, (=_0 s_0)$, $y_1 = x \,\&\, (=_0 s_1) \,\&\, (=_1 s_0)$,
$y_2 = x \,\&\, (=_1 s_1) \,\&\, (=_0 s_0)$, and $y_3 = x \,\&\, (=_1 s_1) \,\&\, (=_1 s_0)$.
A decomposition can use the previously defined function d. Indeed, first compute
$x \,\&\, (=_1 s_1)$ and $x \,\&\, (=_0 s_1)$ using a 1-to-2 decoder. Then pair each
of the outputs with a copy of the signal $s_0$ using the twist operation, and give
the resulting pairs as inputs to two identical 1-to-2 decoders, each computing the
second half of the formulae. Thus a 2-to-4 decoder function is decomposed into a
sequence of parallel compositions (a so-called cascading principle) as follows:

$$d_{2\,to\,4} : B^3 \xrightarrow{d \times \Delta_B} B^4 \xrightarrow{1_B \times twist \times 1_B} B^4 \xrightarrow{d \times d} B^4$$

A corresponding circuit is depicted in Fig. 3. However, it is much more tedious 
to describe and decompose into basic components, say a 4-to-16 decoder, and it 




Fig. 3. A 2-to-4 decoder 



is impossible to describe a general n-to-$2^n$ ($n > 1$) decoder. In order to be
able to manipulate tuples of bits of any length, concatenation lists can be
used.
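The two decompositions above can be checked mechanically. The following is a minimal sketch in Python, assuming wires are modelled as tuples of 0/1 values; the helper names (diag, par, seq and so on) are illustrative and do not come from the paper. Each stage of `decoder_1_to_2` and `decoder_2_to_4` mirrors one arrow of the corresponding composition chain.

```python
# Primitive gates and combinators of the category Circ, modelled on bit tuples.
# All Python names here are illustrative, not notation from the paper.

def diag(t):
    """Delta_B : B -> B^2, split one wire into two."""
    (b,) = t
    return (b, b)

def twist(t):
    """twist : B^2 -> B^2, swap two wires."""
    a, b = t
    return (b, a)

def ident(t):
    """1_B, a plain wire."""
    return t

def eq1(t):
    """=_1, true when the bit is 1 (a plain wire)."""
    return t

def eq0(t):
    """=_0, true when the bit is 0 (an inverter)."""
    (b,) = t
    return (1 - b,)

def and_gate(t):
    """& : B^2 -> B."""
    a, b = t
    return (a & b,)

def par(*parts):
    """Parallel composition; each part is (function, number of input wires)."""
    def run(t):
        out, i = (), 0
        for f, width in parts:
            out += f(t[i:i + width])
            i += width
        assert i == len(t), "wire count mismatch"
        return out
    return run

def seq(*stages):
    """Sequential composition; stages are applied left to right."""
    def run(t):
        for f in stages:
            t = f(t)
        return t
    return run

# (x, s) -> (y1, y0)
decoder_1_to_2 = seq(
    par((diag, 1), (diag, 1)),
    par((ident, 1), (twist, 2), (ident, 1)),
    par((ident, 1), (eq1, 1), (ident, 1), (eq0, 1)),
    par((and_gate, 2), (and_gate, 2)),
)

# (x, s1, s0) -> (y3, y2, y1, y0), reusing the 1-to-2 decoder as a component
decoder_2_to_4 = seq(
    par((decoder_1_to_2, 2), (diag, 1)),
    par((ident, 1), (twist, 2), (ident, 1)),
    par((decoder_1_to_2, 2), (decoder_1_to_2, 2)),
)
```

For instance, `decoder_2_to_4((1, 1, 0))` routes the input to output $y_2$ only. The sketch also makes the limitation visible: every decoder size needs its own hand-written chain.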
3 Concatenation Lists

Suppose $\alpha, \beta, \ldots$ are primitive data types with identity functions $id_\alpha : \alpha \to \alpha$,
$id_\beta : \beta \to \beta, \ldots$ defined for every type. If $\alpha$ is a type, we can form the type $\alpha^*$;
its elements are lists of elements of type $\alpha$. Using the standard CDT technique,
operations on $\alpha, \beta, \ldots$ can be lifted to operations on $\alpha^*, \beta^*, \ldots$ [7].

Conc, or join, lists [5,7] have three constructors: one that makes an empty
list, $[\,] : unit \to \alpha^*$; another that creates a singleton list, $[\cdot] : \alpha \to \alpha^*$; and a third that
concatenates two lists to make a longer one, $+\!+ : \alpha^* \times \alpha^* \to \alpha^*$. The table below
summarises some of the list operations.





Combinator Functions

  distribute-left    $distl\ [x_1, \ldots, x_n]\ y = [(y, x_1), \ldots, (y, x_n)]$
  shift-left         $T\ e\ [x_1, \ldots, x_n] = [e, x_1, \ldots, x_{n-1}]$
  zip                $zip\ [x_1, \ldots, x_n]\ [y_1, \ldots, y_n] = [(x_1, y_1), \ldots, (x_n, y_n)]$

Functionals

  map                $f * [a_1, a_2, \ldots, a_n] = [f\,a_1, f\,a_2, \ldots, f\,a_n]$
  reduce             $/\oplus\ [a_1, a_2, \ldots, a_n] = a_1 \oplus a_2 \oplus \ldots \oplus a_n$
  directed-reduce    $(/_{\to} \oplus\ e)\ [a_1, a_2, \ldots, a_n] = (\ldots((e \oplus a_1) \oplus a_2) \ldots \oplus a_n)$



Operations on conc lists are known to be homomorphisms [7]. Many of these
operations incorporate inherent parallelism and have fixed communication patterns [5,6]. Map $f*$ is completely parallel and requires no communication. It
can be implemented as a parallel composition of n combinational circuits, each
circuit representing function f applied to an individual list element. Reduce can
be evaluated in an obvious tree-like fashion. Directed reduce can be implemented
as a pipeline, or a sequential composition of combinational circuits, each implementing $\oplus$ on a corresponding list element and the output of the previous circuit.
A set of implementation templates for list operations is given in Fig. 4. An important part of categorical definitions of list operations is that they come with a
set of algebraic laws which are used in transformational program derivation [2,4].
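The three evaluation patterns described above can be illustrated in software. The sketch below is only an analogy under stated assumptions: Python lists stand in for conc lists, and the function names `list_map`, `tree_reduce` and `directed_reduce` are ours. The tree-shaped reduce assumes that the operator is associative, which is what permits the balanced grouping.

```python
from functools import reduce as fold_left

def list_map(f, xs):
    """map: apply f to every element independently (n parallel copies of f)."""
    return [f(x) for x in xs]

def tree_reduce(op, xs):
    """reduce: combine a non-empty list with an associative op, pairing
    neighbours level by level, as a balanced tree of op circuits would."""
    if not xs:
        raise ValueError("tree_reduce needs a non-empty list")
    level = list(xs)
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # an odd element at the end passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]

def directed_reduce(op, e, xs):
    """directed reduce: (...((e op a1) op a2) ... op an), a pipeline of op
    stages, each taking the previous stage's output and one list element."""
    return fold_left(op, xs, e)
```

The tree version needs about $\log_2 n$ levels of operators for a list of length n, while the directed version is a chain of n stages; this mirrors the tree and pipeline templates mentioned in the text.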



3.1 n-to-$2^n$ Decoder

A general n-to-$2^n$ decoder function is described as follows. Given an input
signal x and a number $s = s_0 \cdot 2^0 + s_1 \cdot 2^1 + \ldots + s_{n-1} \cdot 2^{n-1}$ represented as a
bit list $[s_{n-1}, s_{n-2}, \ldots, s_0]$, the decoder must produce $2^n$ outputs $[y_{2^n-1}, \ldots, y_0]$,
such that $y_i = x \& (i = s)$, $0 \leq i \leq 2^n - 1$.

Fig. 4. Implementation templates for list operations (panels include a template for the distribute-left function and a composition of distribute with a further operation)

An n-to-$2^n$ decoder function can be represented as a directed reduce. First,
bits x and $s_{n-1}$ are used by a 1-to-2 decoder to produce a pair of outputs,
$x \& (=_1 s_{n-1})$ and $x \& (=_0 s_{n-1})$. These outputs represent partial results that are
to be used in the next cascade, as in the 2-to-4 example, with the element $s_{n-2}$ of
the input list distributed among these partial results. This produces 4 new bits
which, in their turn, have to be paired with the consecutive element $s_{n-3}$ of the
input list, etc. We want to express this method in terms of list operations, hence
we will be using lists to represent both the input signal s and the intermediate results:

$n\text{-to-}2^n : B^* \times B^* \to B^*$

$n\text{-to-}2^n([x], [s_{n-1}, \ldots, s_0]) = (/_{\to}\ (/\!+\!+ \circ (MakeList \circ d)* \circ distl)\ [x])\ [s_{n-1}, \ldots, s_0]$.

Function MakeList makes a list of two elements from a pair: $MakeList(x, y) = [x, y]$
or, more concisely, $MakeList = +\!+ \circ ([\cdot] \circ p_1 \times [\cdot] \circ p_2)$, where $p_1$ and $p_2$ are
projection functions. Function d is the 1-to-2 decoder defined in the previous section.
A combinational circuit for an n-to-$2^n$ decoder can be obtained by taking
a general template for a directed reduce (see Fig. 4) and replacing every $\oplus$
"box" with the operation $(/\!+\!+ \circ (MakeList \circ d)* \circ distl)$. In the latter composition,
$/\!+\!+$ and MakeList are "service" operations that only change the representation of
data; their implementation does not require separate circuits. Hence, every $\oplus$ box
in the directed reduce template is, in its turn, a composition of templates for
distl and $d*$ (map). A straightforward replacement of the $\oplus$ boxes by a composition
of the corresponding templates for operations distl and $d*$ results in the circuit depicted
in Fig. 5.
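The definition above can be exercised with a short Python sketch. It is an illustration under assumptions, not the paper's implementation: bits are the integers 0 and 1, lists are Python lists, and all function names are ours. One point is an explicit assumption of the sketch: distl, as tabulated, yields pairs (s, p) with the distributed select bit first, whereas d was defined on (data, select), so each pair is passed to d in swapped order.

```python
from functools import reduce as fold_left

def d(x, s):
    """1-to-2 decoder: (x, s) -> (x & (=1 s), x & (=0 s))."""
    return (x & s, x & (1 - s))

def distl(xs, y):
    """distl [x1..xn] y = [(y, x1), ..., (y, xn)], as in the table of list operations."""
    return [(y, x) for x in xs]

def make_list(pair):
    """MakeList (x, y) = [x, y]."""
    return [pair[0], pair[1]]

def concat_all(lists):
    """Reduce with concatenation: flatten a list of lists."""
    return [item for sub in lists for item in sub]

def step(partials, s):
    """One box of the directed reduce: flatten . map (MakeList . d) . distl.
    Assumption: each pair (sel, p) produced by distl is fed to d as d(p, sel),
    so the partial result p is the data input and sel is the select bit."""
    return concat_all([make_list(d(p, sel)) for (sel, p) in distl(partials, s)])

def decoder(x, select_bits):
    """n-to-2^n decoder; select_bits = [s_{n-1}, ..., s_0].
    Returns the outputs in the order [y_{2^n - 1}, ..., y_0]."""
    return fold_left(step, select_bits, [x])
```

For example, `decoder(1, [1, 0, 1])` selects output $y_5$ of a 3-to-8 decoder. Each call to `step` doubles the number of partial results, which corresponds to one cascade of 1-to-2 decoders in the circuit.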

4 Discussion 

A set of basic algebras for bits, characters, short and long integers, etc., can 
easily be fully specified in terms of categories with products. These primitive 
data types reflect characteristics of the basic components from which to build 
a particular application. These basic components are realized in a straightfor- 
ward compositional way by combinational circuits and implemented by blocks 
of FPGA logic cells. 
Fig. 5. An n-to-$2^n$ decoder is a composition of templates (shown: circuit for a 3-to-8 decoder, composing the directed-reduce template with distl and map)



Concatenation lists are natural data aggregates for many applications. Lists 
can be fully defined by a fixed set of (homomorphic) operations, and for each 
operation from this set we can design one or more implementation templates 
that have desirable properties, such as regular, repetitive structures and local 
interconnections between components. 

However, as soon as we need delays, combinational circuits alone are not
enough. To implement a delay, we need a latch, i.e., a circuit with a feedback
loop. To reason about such circuits we have to consider at any moment the state
of the whole circuit. Assuming the synchronous model, we need functions which
describe the general change of state. Unfortunately, we cannot do this in a category with products only. However, if we follow the advice given by N. Wirth [9],
restricting circuit design to combinational circuits, and treating latches and registers as complete parts, so that feedback loops exist only within these parts, we
can still retain much of the simplicity and expressive power of homomorphisms.

In the future, we want to extend our approach to the design and analysis of any
synchronous, sequential circuit, i.e., circuits that consist of combinational circuits
and registers, where the latter represent state. We hope to develop a compositional
approach, similar to the one described in this paper, within the scope of the
distributive category, so that any state machine can be designed and analysed
in a concise stepwise manner.
Abstract. Experimentation in software engineering is important but
difficult. One reason it is so difficult is that there are a large number of 
context variables, and so creating a cohesive understanding of experimen- 
tal results requires a mechanism for motivating studies and integrating 
results. This paper argues for the necessity of a framework for organiz- 
ing sets of related studies. With such a framework, experiments can be 
viewed as part of common families of studies, rather than being isolated 
events. Common families of studies can contribute to important and rel- 
evant hypotheses that may not be suggested by individual experiments. 
A framework also facilitates building knowledge in an incremental man- 
ner through the replication of experiments within families of studies. 
Building knowledge in this way requires a community of researchers that 
can replicate studies, vary context variables, and build abstract models 
that represent the common observations about the discipline. This paper 
also presents guidelines for lab packages, meant to encourage and sup- 
port replications, that encapsulate materials, methods, and experiences 
concerning software engineering experiments. 



1 Introduction 



Experimentation in software engineering is necessary. Common wisdom, 
intuition, speculation and proofs of concepts are not reliable sources of credible 
knowledge. On the contrary, progress in any discipline involves building models 
that can be tested, through empirical study, to check whether the current understanding of the field is correct¹. Progress comes when what is actually true can
be separated from what is only believed to be true. To accomplish this, the scientific method supports the building of knowledge through an iterative process
of model building, prediction, observation, and analysis. It requires that no confidence be placed in a theory that has not stood up to rigorous deductive testing
[21]. That is, any scientific theory must be (1) falsifiable, (2) logically consistent,
(3) at least as predictive as other competing theories, and (4) confirmed in its predictions
by observations during tests for falsification. According to
Popper, a theory can only be shown to be false or not yet false; researchers only 
become confident in a theory when it has survived numerous attempts made at 
its falsification. This paradigm is a necessary step for ensuring that opinion or 
desire does not influence knowledge. 

Experimentation in software engineering is difficult. Carrying out em- 
pirical work is complex and time consuming; this is especially true for software 
engineering. Unlike manufacturing, we do not build the same product, over and 
over, to meet a particular set of specifications. Software is developed and each 
product is different from the last. So, software artifacts do not provide us with a 
large set of data points permitting sufficient statistical power for confirming or 
rejecting a hypothesis. Unlike physics, most of the technologies and theories in 
software engineering are human-based, and so variation in human ability tends 
to obscure experimental effects. Human factors tend to increase the costs of ex- 
perimentation while making it more difficult to achieve statistical significance. 

Abstracting conclusions from empirical studies in software engineering 
research is difficult. An important reason why experimentation in software 
engineering is so hard is that the results of almost any process depend to a 
large degree on a potentially large number of relevant context variables. Because 
of this, we cannot a priori assume that the results of any study apply outside 
the specific environment in which it was run. For isolated studies, even if they
are themselves well-run, it is difficult to understand how widely applicable the
results are, and thus to assess the true contribution to the field.

¹ For the purpose of this paper, we use the definitions of some key terms from [15] and
[1]. An empirical study, in a broad sense, is an act or operation for the purpose of
discovering something unknown or of testing a hypothesis, involving an investigator
gathering data and performing analysis to determine what the data mean. This covers
various forms of research strategies, including all forms of experiments, qualitative
studies, surveys, and archival analyses. An experiment is a form of empirical study
where the researcher has control over some of the conditions in which the study takes
place and control over the independent variables being studied; an operation carried
out under controlled conditions in order to test a hypothesis against observation.
This term thus includes quasi-experiments and pre-experimental designs.
A theory is a possible explanation of some phenomenon. Any theory is made up
of a set of hypotheses. A hypothesis is an educated guess that there exists (1) a
(causal) relation among constructs of theoretical interest; (2) a relation between a
construct and observable indicators (how the construct can be observed). A model
is a simplified representation of a system or phenomenon; it may or may not be
mathematical or even formal; it can be a theory.

As an example, consider the following study: 

• Basili/Reiter. This study was undertaken in 1976 in order to characterize 
and evaluate the development processes of development teams using a dis- 
ciplined methodology. The effects of the team methodology were contrasted 
with control groups made up of development teams using an "ad hoc" development strategy, and with individual developers (also "ad hoc"). Hypotheses
were proposed: that (BR1) a disciplined approach should reduce the average
cost and complexity (faults and rework) of the process and (BR2) the disciplined team should behave more like an individual than a team in terms of
the resulting product. The study addressed these hypotheses by evaluating 
particular methods (such as chief programmer teams, top down design, and 
reviews) as they were applied in a classroom setting. [7] 

This study, like any other, required the experimenters to construct models 
of the processes studied, models of effectiveness, and models of the context in 
which the study was run. Replications that alter key attributes of these models 
are then necessary to build up knowledge about whether the results hold under 
other conditions. Unfortunately, in software engineering, too many studies tend 
to be isolated and are not replicated, either by the same researchers or by others. 
Basili/Reiter was a rigorous study, but unfortunately never led to a larger body 
of work on this subject. The specific experiment was not replicated, and the 
applicability of the hypotheses in other contexts was not studied. Thus it was 
never investigated whether the results hold, for example: 

• for software developers at different levels of experience (the original experi- 
ment used university students); 

• if development teams are composed differently (the original experiment used 
only 3-person teams); 

• if another disciplined methodology had been used (i.e., were the benefits 
observed due to the particular methodology used in the experiment, or would 
they be observed for any disciplined methodology?). 

2 A Motivating Example: Software Reading Techniques 

Yet even when replications are run, it’s hard to know how to abstract important 
knowledge without a framework for relating the studies. To illustrate, we present 
our work on reading techniques. Reading techniques are procedural techniques, 
each aimed at a specific development task, which software developers can follow 
in order to obtain the information they need to accomplish that task effectively 
[2, 3]. We were interested in studying reading techniques in order to determine if 
beneficial experience and work practices could be distilled into procedural form, 
and used effectively on real projects. We felt that reading techniques were of rel- 
evance and value to the software engineering community, since reading software 
documents (such as requirements, design, code, etc.) is a key technical activity.
Developers are often called upon to read software documents in order to extract
specific information for important software tasks, e.g. to read a requirements
document in order to find defects during an inspection, or an Object-Oriented
design in order to identify reusable components. However, while developers are
usually taught how to write software documents, the skills required for effective
reading are rarely taught and must be built up through experience. In fact, we
felt that research into reading could provide a model for how to effectively write 
documents as well: by understanding how readers perform more effectively it 
may be possible to write documents in a way that facilitates the task. 

However, the concept of reading techniques cannot be studied in isolation. 
Like any other software process, reading techniques must be tailored to the en- 
vironment in which they are run. Our aim in this research was to generate sets 
of reading techniques that were procedurally defined, tailorable to the environ- 
ment, aimed at accomplishing a particular task, and specific to the particular 
document and notation on which they would be applied. This has led to a series of
studies in which we evaluated the following types of reading techniques:

• Defect-Based Reading (DBR) focused on defect detection in requirements, 
where the requirements were expressed using a state machine notation called 
SCR [13, 22]. 

• Perspective-Based Reading (PBR) also focused on defect detection in re- 
quirements, but for requirements expressed in natural language [4, 16]. 

• Use-Based Reading (UBR) focused on anomaly detection in user interfaces 
[27]. 

• Second Version of PBR (PBR2) consisted of new techniques that were more 
procedurally-oriented versions of the earlier set of PBR techniques. In par- 
ticular, we made the techniques more specific in all of their steps [24]. 

• Scope-Based Reading (SBR) consisted of two reading techniques that were 
developed for learning about an Object-Oriented framework in order to reuse 
it [10, 23]. 

A framework that makes explicit the different models used in these exper- 
iments would have many benefits. Such a framework would document the key 
choices made during experimental design, along with their rationales. The frame- 
work could be used to choose a focus for future studies: i.e., help determine the 
important attributes of the models used in an experiment, and which should be 
held constant and which varied in future studies. The ultimate objective is to 
build up a unifying theory by creating a list of the specific hypotheses investi- 
gated in an area, and how similar or different they all are. 

Using an organizational framework also allows other experimenters to un- 
derstand where different choices could have been made in defining models and 
hypotheses, and raises questions as to their likely outcome. Because these frame- 
works provide a mechanism by which different studies can be compared, they 
help to organize related studies and to tease out the true effects of both the 
process being studied and the environmental variables. 
3 The GQM Goal Template as a Tool for Experimentation

Examples of such organizational frameworks do exist in the literature, e.g. [9,
17, 20]. For the purpose of this paper we find the Goal/Question/Metric (GQM)
Goal Template [8] useful. The GQM method was defined as a mechanism for 
defining and interpreting a set of operational goals using measurement. It rep- 
resents a top-down systematic approach for tailoring and integrating goals with 
models of software processes, products, and quality perspectives, based upon the 
specific needs of a project and organization. 

The GQM goal template is a tool that can be used to articulate the purpose 
of any study. It ties together the important models, and provides a basis against 
which the appropriateness of a study’s specific hypotheses, and dependent and 
independent variables, may be evaluated. There are five parameters in a GQM 
goal template: 

• object of study: a process, product or any other experience model 

• purpose: to characterize (what is it?), evaluate (is it good?), predict (can 
I estimate something in the future?), control (can I manipulate events?), 
improve (can I improve events?) 

• focus: model aimed at viewing the aspect of the object of study that is of in- 
terest, e.g., reliability of the product, defect detection/prevention capability 
of the process, accuracy of the cost model 

• point of view: the perspective of the person needing the information;
e.g., in theory testing the point of view is usually that of the researcher trying to
gain some knowledge

• context: models aimed at describing environment in which the measurement 
is taken 

For example, the goal of the Basili/Reiter study, previously described, might 
be instantiated as: 

To analyze the development processes of a 1) disciplined-methodology team
approach, 2) ad hoc team approach, and 3) ad hoc individual approach
for the purpose of characterization and evaluation
with respect to cost and complexity (faults and rework) of the process
from the point of view of the developer and project manager
in the context of an advanced university classroom
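One way to make the five template parameters concrete is to record them as a small data structure. The following Python sketch is illustrative only: the class name `GQMGoal`, its field names and the `render` method are our assumptions, not part of the GQM method; the field values are taken from the Basili/Reiter instantiation above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GQMGoal:
    """The five parameters of a GQM goal template (field names are illustrative)."""
    object_of_study: str
    purpose: str
    focus: str
    point_of_view: str
    context: str

    def render(self) -> str:
        """Assemble the goal sentence in the order used by the template."""
        return (
            f"Analyze {self.object_of_study} "
            f"for the purpose of {self.purpose} "
            f"with respect to {self.focus} "
            f"from the point of view of {self.point_of_view} "
            f"in the context of {self.context}"
        )

# The Basili/Reiter goal, with the wording of the instantiation in the text.
basili_reiter = GQMGoal(
    object_of_study=(
        "the development processes of a 1) disciplined-methodology team approach, "
        "2) ad hoc team approach, and 3) ad hoc individual approach"
    ),
    purpose="characterization and evaluation",
    focus="cost and complexity (faults and rework) of the process",
    point_of_view="the developer and project manager",
    context="an advanced university classroom",
)
```

Keeping the parameters as separate fields, rather than as one sentence, makes it straightforward to compare two studies field by field, which is the use of the template pursued in the rest of this paper.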

Due to the nature of software engineering research, instantiated goals tend to 
show certain similarities. The purpose of studies is often evaluation; that is, 
researchers tend to study software technologies in order to assess their effect 
on development. For our purposes, the point of view can be considered to be 
that of the researcher or knowledge-builder. While studies can be run from the
point of view of the project manager, i.e. requiring some immediate feedback 
as to effects on effort and schedule, published studies have usually undergone 
additional, post-hoc analysis. 

The remaining fields in the template require the construction of more com- 
plicated models, but still show some similarities. The object of study is often (but 
not always) a process; researchers are often concerned with evaluating whether
or not a particular development process represents an improvement to the way 
software is built. (E.g.: Does Object-Oriented Analysis lead to an improved im- 
plementation? Does an investment in reviews lead to less buggy, more reliable 
systems? Does reuse allow quality systems to be built more cheaply?) When 
the object of study is a process, the focus of the evaluation is the process’ ef- 
fect. The experimenter may measure its effect on a product, that is, whether 
the process leads to some desired attribute in a software work product. Or, the 
experimenter may attempt to capture its effect on people, e.g. whether practi- 
tioners were comfortable executing the process or found it tedious and infeasible. 
Finally, the context field should include a large number of environmental vari- 
ables and therefore tends to exhibit the most variability. Studies may be run on 
students or experts; under time constraints, or not; in well-understood applica- 
tion domains, or in cutting-edge areas. There are numerous such variables that 
may influence the results of applying a technique. 

For the remainder of this paper, we will illustrate our conclusions by concen- 
trating on studies that investigate process characteristics with respect to their 
effects on products. A GQM template for this class of studies is: 

Analyze processes to evaluate their effectiveness on a product from the point 
of view of the knowledge builder in the context of (a particular variable set). 

For particular studies in this class, constructing a complete GQM template 
requires making explicit the process (object of study), the effect on the product 
(focus), and context models in the experiment. Making these models explicit is 
necessary in order to understand the conditions under which the experimental 
results hold. 

For example, consider the GQM templates for the list of reading technique 
experiments described in the previous section. There are many ways of classifying 
processes, but we might first classify processes by scope as: 

• Techniques (processes that can be followed to accomplish some specific task), 

• Methods² (processes augmented with information concerning when and how
the process should be applied), 

• Life Cycle Models (processes which describe the entire software development 
process). 

Each of these categories could be subdivided in turn. The set of techniques, 
for example, could be classified based on the specific task as: Reading, Testing, 
Designing, and so on. We have found it helpful to think of the range of values 
as organized in a hierarchical fashion, in which more general values are found at 
the top of the tree, and each level of the tree represents a new level of detail. 
(Figure 1) 

Selecting a particular type of process for study, our GQM template then 
becomes: 

Analyze reading techniques to evaluate their effectiveness on a product from 
the point of view of the knowledge builder in the context of a particular 
variable set 
Fig. 1. A portion of the hierarchy of possible values for describing software
processes (values shown: Life Cycle Model, Method, Technique)



The reading technique experiments were concerned with studying the effect of 
the reading technique on a product. So, the model of focus needs to specify both 
how effectiveness is to be measured and the product on which the evaluation 
is performed. We find it useful to divide the set of effectiveness measures into 
analysis and construction measures, based on whether the goal of the process 
is to analyze intrinsic properties of a document or to use it in building a new 
system. Each of these categories can be further broken down into more specific 
types of process goals, for which different effectiveness measures may apply (Fig. 
2). For example, the effectiveness of a process for performing maintenance can
be evaluated by how that process affects the cost of making a change to the
system. The effectiveness of a process for detecting defects in a document can be
measured by the number of faults it helps find. Of course, many more measures
exist than will fit into Figure 2. For instance, rather than measure the number of
faults a defect detection process yields, it might be more appropriate to measure
the number of errors³, or the amount of effort required, among other things.

Similarly, a software document can be classified according to the model of 
a software system it contains (a relatively well-defined set) and further subdi- 
vided into the specific notations that may be used (Fig. 3). The main purpose of 
organizing the possible values hierarchically is to organize a conception of the 
problem space that can be used by others for classifying their own experiments. 
The actual criteria used are somewhat subjective; naturally there are multiple 
criteria for classifying processes, effectiveness measures, and software documents, 
but we have selected just those that have contributed to our conception of read- 
ing techniques. 

Thus a GQM template for the PBR experiment could be: 

Analyze reading techniques to evaluate their ability to detect defects in a 
Requirements Document written in English from the point of view of the
knowledge builder in the context of a particular variable set. 

² The definitions of "technique" and "method" are adapted from [5].

³ Here we are using the terms "faults" and "errors" according to the IEEE standard
definitions [14], in which "fault" refers to defects appearing in some artifact while
"error" refers to an underlying human misconception that may be translated into
faults.
Fig. 2. A portion of the hierarchy of possible values for describing the effectiveness of software processes

Fig. 3. A portion of the hierarchy of possible values for describing software
documents (values shown: Document, subdivided into Requirements, Design, Code)

A GQM goal is not meant to be a definitive description, but reflects the in- 
terests and priorities of the experimenter. If we were to study the process model 
for the reading techniques in each experiment in more detail, we would see that 
each technique is tailored to a specific task (e.g., analysis or construction, etc.) 
and to a specific document. This is what characterizes the reading techniques 
and distinguishes them from one another. Thus the process goals used to classify 
measures of effectiveness in Figure 2 can be easily adapted to describe the pro- 
cesses themselves (Figure 4). The distinction between analysis and construction 
process goals can apply directly to processes. That is, we hypothesize that anal- 
ysis tasks differ sufficiently from construction tasks that, along with differences 
in the way they may be evaluated for effectiveness, there may also be different 
guidelines used in their construction. Thus Figures 2 and 3 can also be mechanisms for identifying process model attributes. They should be accounted for in
the process model as well as the effect on process. 

Thus we can say that we are: 

analyzing a reading technique for the purpose of evaluating its ability to 
detect defects in a natural language requirements document 
or we can say that we are: 

analyzing a reading technique tailored to defect detection in natural language 
requirements for the purpose of evaluation. 
Fig. 4. A portion of the hierarchy of possible values for describing the goal of a
software engineering process (values shown under Process Goal: Defect Detection, Usability, ..., Reuse, Maintenance)



It depends on whether we are emphasizing the definition of the process or of 
its effectiveness. 

In linking goal templates to hypotheses, we can think of the process model 
(object of study) as the independent variable, the effect on product (focus) as the 
dependent variable, and the context variables as the variables that exist in the 
environment of the experiment. The differences or similarities between experi- 
mental hypotheses can then be described in terms of these hierarchies of possible 
values. For example, consider the studies of DBR and PBR. In both cases, the 
process model was focused on the same task (defect detection); although the no- 
tation differed, both were also focused on the same document (requirements). If 
all other attributes for process, product, and context models were held constant, 
we could begin to think of hypotheses at a higher level of abstraction. That is, 
instead of the hypothesis: 

Subjects using a reading technique tailored to defect detection in natural language requirements are more effective than
subjects using ad hoc techniques for this task.

the following hypothesis might be more useful:

Subjects using reading techniques tailored to defect detection 
in requirements are more effective than subjects using ad hoc 
techniques for this task. 

The difference between these hypotheses is that the focus of the study is de- 
scribed at a higher level of abstraction for the second hypothesis (requirements) 
than for the first (natural language requirements). 
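The step from the first hypothesis to the second amounts to moving one level up a value hierarchy. A minimal Python sketch of that idea follows. It rests on assumptions: the names `PARENT`, `ancestors`, `common_abstraction` and `hypothesis` are ours, and placing the two notation values (natural language, as in PBR, and SCR, as in DBR) directly under "requirements" is a simplification made for the sketch.

```python
# Child -> parent links for a small part of the document hierarchy.
# Placing both notation values directly under "requirements" is an
# assumption of this sketch, made to keep the example small.
PARENT = {
    "natural language requirements": "requirements",
    "SCR requirements": "requirements",
    "requirements": "document",
    "design": "document",
    "code": "document",
}

def ancestors(value):
    """Return the value followed by its increasingly general ancestors."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def common_abstraction(a, b):
    """Most specific value in the hierarchy that covers both a and b, or None."""
    covering_b = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in covering_b:
            return candidate
    return None

def hypothesis(document_value):
    """State the reading-technique hypothesis at a chosen level of abstraction."""
    return (
        "Subjects using reading techniques tailored to defect detection "
        f"in {document_value} are more effective than subjects using "
        "ad hoc techniques for this task."
    )
```

Here `common_abstraction("natural language requirements", "SCR requirements")` gives "requirements", the level at which the PBR and DBR studies can be combined, and `hypothesis("requirements")` reproduces the second hypothesis above.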

This difference in abstraction makes the second hypothesis more difficult to 
test. In fact, probably no single study could ever give us overwhelming evidence 
as to its validity, or lack thereof. Testing the second hypothesis would require 
some idea of what types of requirements notation are of interest to practition- 
ers. Building up a convincing body of evidence requires the combined analysis 
of multiple studies of specific reading techniques for defect detection in require- 
ments. But the effort required to formulate the hypothesis and begin building 
a body of evidence helps advance the field of software engineering. At best, the 
evidence can lead to the growth of a body of knowledge, containing basic and im- 
portant theories underlying some aspect of the field. At worst, the effort spent in 
specifying the models forces us to think more deeply about the relevant ways of
characterizing software engineering models that we, as researchers, are implicitly 
constructing anyway. 

The above discussion should not be taken to imply that the attributes iden- 
tified in Figures 1 through 4 are the only ones that are important, or for which 
hierarchies of possible values exist. To choose another example, in specifying the 
model of the context it is almost always important to characterize the experience 
of the subjects of the experiment. The most appropriate way of characterizing 
experience depends on many things; two possibilities are proposed in Figure 5. 



[Figure 5: two value hierarchies for the attribute Experience. First hierarchy: Students; Professionals. Second hierarchy: Never used process before; Learned process in a class; Applied process on one project; Applied process on 2-3 projects; Applied process on >3 projects.]

Fig. 5. Two possible value hierarchies for measuring subject experience



The trees shown in Figure 5 present two different ways of characterizing expe- 
rience. The first is a simpler way of characterizing the attribute that distinguishes 
only between subjects who are still learning software engineering principles ver- 
sus those who have applied them on real projects. The second hierarchy attempts 
to place finer distinctions on the amount of experience a subject has applying a 
particular process. Each may be appropriate to different circumstances. 

4 Replicating Experiments 

In preceding sections of this paper, we have tried to raise several reasons why 
families of replicated experiments are necessary for building up bodies of knowl- 
edge about hypotheses. Another reason for running replications is that they can 
increase the amount of confidence in results by addressing certain threats to 
validity: Internal validity defines the degree of confidence in a cause-effect re- 
lationship between factors of interest and the observed results, while external 
validity defines the extent to which the conclusions from the experimental con- 
text can be generalized to the context specified in the research hypothesis [11]. 
In this section, we discuss replications in more detail and look at the practical 
considerations that result. 
Our primary strategy for supporting replications in practice has been the
creation of lab packages, which collect information on an experiment such as 
the experimental design, the artifacts and processes used in the experiment, the 
methods used during the experimental analysis, and the motivation behind the 
key design decisions. Our hope has been that the existence of such packages 
would simplify the process of replicating an experiment and hence encourage 
more replications in the discipline. Several replications have been carried out in 
this manner and have contributed to a growing body of knowledge on reading 
techniques. 



4.1 Types of Replications 

Since we consider that replications may be undertaken for various reasons, we 
have found it useful to enumerate the various reasons, each of which has its own 
requirements for the lab package. In our view the types of replications that need 
to be supported can be grouped into 3 major categories: 

1. Replications that do not vary any research hypothesis. Replications
of this type vary none of the dependent or independent variables of the 
original experiment. 

1.1. Strict replications (i.e. replications that duplicate as accurately as 
possible the original experiment). These replications are necessary to 
increase confidence in the validity of the experiment. They demonstrate 
that the results from the original experiment are repeatable, and have 
been reported accurately by the original experimenters. 

1.2. Replications that vary the manner in which the experiment is 
run. These studies seek to increase our confidence in experimental results 
by addressing the same problem as previous experiments, but altering 
the details of the experiment so that certain internal threats to validity 
are addressed. For example, a replication may vary the order of activities 
to avoid the possibility that results depend not on the process used, but 
on the order in which activities in the experiment are completed. 

The attempt to compensate for threats to internal validity may also lead 
to other types of changes. For example, a process may be modified so that 
the researchers can assess the amount of process conformance of subjects. 
Although the aim of the change may have been to address internal valid- 
ity, the new process should be evaluated in order to understand whether 
unanticipated effects on process effectiveness have resulted. Thus such a 
replication would fall into the second major category, discussed below. 

2. Replications that vary the research hypotheses. Replications of this 
type vary attributes of the process, product, and context models but remain 
at the same level of specificity as the original experiment. 

2.1. Replications that vary variables intrinsic to the object of study
(i.e. independent variables). These replications investigate what aspects
of the process are important by systematically varying intrinsic
properties of the process and examining the results. This type of experiment
requires the process to be supplied in sufficient detail that changes
can be made. This implies that the original experimenters must provide 
the rationales for the design decisions made as well as the finished prod- 
uct. For example, researchers may question whether the specificity at 
which the process is described affects the results of applying the process. 
In this sense, the study of PBR2 may be seen as a replication of the 
study of PBR, in which the level of specificity of the process was varied 
but all other attributes of the process model remained the same. 

2.2. Replications that vary variables intrinsic to the focus of the 
evaluation (i.e. dependent variables). Replications of this type may 
vary the ways in which effectiveness is measured, in order to understand 
for what dimensions of a task a process results in the most gain. For 
example, a replication might choose another effectiveness measure from 
those listed in Figure 2, investigating whether a defect detection process 
is more beneficial for finding errors than faults. Other aspects of the 
focus model might be varied instead, e.g. a process might be evaluated 
on a document of the same type but different notation to see if it is 
equally effective (see Figure 3). 

2.3. Replications that vary context variables in the environment 
in which the solution is evaluated. These studies can identify po- 
tentially important environmental factors that affect the results of the 
process under investigation and thus help understand its external valid- 
ity. For example, replications may be run using the same process and 
product models as the original experiment but on professionals instead 
of students (see Figure 5) to see if the same results are obtained. 

3. Replications that extend the theory. These replications help determine 
the limits to the effectiveness of a process, by making large changes to the 
process, product, and/or context models to see if basic principles still hold. 
We discussed replications in the previous category as replacing the value 
of some variable (e.g. document on which the process was applied, Figure
3) with another, equally specific value (e.g. SCR requirements instead of
English-language requirements). Replications in this category, however, can
be thought of as replacing an attribute of a process, product, or context 
model with a value at a higher level of abstraction (i.e. from a higher level 
in the hierarchy). Again using Figure 3, researchers may choose to study 
whether a type of process is applicable to requirements documents in general, 
rather than limiting their scope to a specific kind. The type of hypotheses 
associated with such replications was discussed in section 3. 



4.2 Implications for Lab Package Design 

In software engineering research, there has been a movement toward the reuse 
of physical artifacts and concrete processes between experiments. This is indeed 
a useful beginning. The cost of an experiment is greatly increased if the preparation
of multiple artifacts is necessary. Creating artifacts which are representative
of those used in real development projects is difficult and time consuming.
Reusing artifacts can thus reduce the time and cost needed for experimenta- 
tion. A more significant benefit is that reuse allows the opportunity to build up 
knowledge about the actual use of particular, non-trivial artifacts in practice. 
Thus replications (and experimentation in general) could be facilitated if there 
were repositories of reusable artifacts of different types (e.g. requirements) which 
have a history of reuse and which, therefore, are well understood. (A model for 
such repositories could be the repository of system architectures [12], where the 
relevant attributes of each design in the repository are known and described.) 

A first step towards this goal is the construction of web-based laboratory 
packages. At the most basic level, these packages allow an independent experi- 
menter to download experimental materials, either for reuse or for better under- 
standing. In this way, these packages support strict replications (as defined in 
section 4.1), which require that the processes and artifacts used in the original 
experiment be made available to independent researchers. 

However, web-based lab packages should be designed to support more sophis- 
ticated types of replications as well. For example, packages should assist other 
experimenters in understanding and addressing the threats to validity in order 
to support replications that vary some aspects of the experimental setup. Due 
to the constraints imposed by the setting in which software engineering research 
is conducted, it is almost never possible to rule out every single threat to validity.
Choosing the “least bad” set of threats given the goal of the experiment is
necessary. Lab packages need to acknowledge this fact and make the analysis of 
the constraints and the threats to validity explicit, so that other studies may use 
different experimental designs (that may have other threats to validity of their 
own) to rule out these threats. 

Replications that seek to vary the detailed hypotheses have additional re- 
quirements if the lab package is to support them as well. For example, in order 
for other experimenters to effectively vary attributes of the object of study, the 
original process must be explained in sufficient detail that other researchers can 
draw their own conclusions about key variables. Since it is unreasonable to ex- 
pect the original experimenters to determine all of the key variables a priori, lab 
packages must provide rationales for key experimental context decisions so that 
other experimentalists can determine feasible points of variation of interest to 
themselves. Similarly, lab packages must specify context variables in sufficient 
detail that feasible changes to the environment can be identified and hypotheses 
made about their effects on the results. 

Finally, in order to build up a body of knowledge about software engineering 
theories, researchers should know which experiments have been run that offer 
related results. Therefore, lab packages for related experiments should be linked, 
in order to collect different experiments that address different areas of the prob- 
lem space, and contribute evidence relevant to basic theories. The web is an ideal 
medium for such packages since links can be added dynamically, pointing to new, 
related lab packages as they become available. Thus it is to be hoped that lab 
packages are “living documents” that are changed and updated to reflect our
current understanding of the experiments they describe. 

Lab packages have been our preferred method for facilitating the abstraction 
of results and experiences from series of well-designed studies. Interested readers 
are referred to existing examples of lab packages: [25, 26]. By collecting detailed 
information and results on specific experiments, they summarize our knowledge 
about specific processes. They record the design and analysis methods used and 
may suggest new ones. Additionally, by linking related studies they can help 
experimenters understand what factors do or do not impact effectiveness. 

4.3 The Experimental Community 

A group of researchers, from both industry and academia, has been organized 
since 1993 for the purpose of facilitating the replication of experiments. The 
group is called ISERN, the International Software Engineering Research Net- 
work, and includes members in North America, Europe, Asia, and Australia. 
ISERN members publish common technical reports, exchange visitors, and organize
annual meetings to share experiences on software engineering experimentation^.
They have begun replicating experiments to better understand the
success factors of inspection and reading.

The Empirical Software Engineering journal has also helped build an experi- 
mental community by providing a forum for publishing descriptions of empirical 
studies and their replications. An especially noteworthy aspect of the journal is 
that it is open to publishing replicated studies that, while rigorously planned and 
analyzed, yield unexpected results that did not confirm the original study. Although
it has traditionally been difficult to publish such “unsuccessful” studies in
the software engineering literature, this knowledge must be made available if the 
community is to build a complete and unbiased body of knowledge concerning 
software technologies. 

5 Conclusions 

The above discussion leads us to propose that the following criteria are necessary 
before we can begin to build up comprehensive bodies of knowledge in areas of 
software engineering: 

1. Hypotheses that are of interest to the software engineering community and
are written in a context that allows for a well-defined experiment;

2. Context variables, suggested by the hypotheses, that can be changed to allow 
for variation of the experimental design (to make up for validity threats) and 
the context of experimentation; 

3. A sufficient amount of information so that the experiment can be replicated 
and built upon; and 

^ More information is available at the URL
http://wwwagse.informatik.uni-kl.de/ISERN/isern.html
4. A community of researchers that understand experimentation, the need for
replication, and are willing to collaborate and replicate.

With respect to the Basili/Reiter study introduced in section 1, we can note
that while it satisfied criteria 1 and 3, it failed with respect to criteria 2 and
4. It was not suggested by the authors that other researchers might vary the
design or manipulate the processes or criteria used for evaluation (although the
analysis of the data was varied in a later study [6]). Nor was there a community
of researchers willing to analyze the hypotheses even if suggestions for replication
had been made.

In contrast, the set of experiments on reading, discussed in a working group
at the 1997 annual meeting of ISERN [18], is an example of how a body of
knowledge can be built up by independent researchers working on different parts
of the problem and exposing their conclusions to different plausible rival hypotheses.
We have shown in this paper that experimental constraints in software engineering
research make it very difficult, if not impossible, to design a perfect
single study. In order to rule out the threats to validity, it is more realistic to rely
on the “parsimony” concept than to be frustrated by trying to
remove them completely. This appeal to parsimony is based on the assumption
that the evidence for an experimental effect is more credible if that effect can be
observed in numerous independent experiments, each with different threats
to validity [11].

A second conclusion is that empirical research must be a collaborative activ- 
ity because of the huge number of problems, variables, and issues to consider. 
This complexity can be faced with extensive brainstorming, carefully designing 
complementary studies that provide coverage of the problem and solution space, 
and reciprocal verification. 

It is our contention that interesting and relevant hypotheses can be identified 
and investigated effectively if empirical work is organized in the form of families 
of related experiments. In this paper, we have raised several reasons why such 
families are necessary: 

• To investigate the effects of alternative values for important attributes of the 
experimental models; 

• To vary the strategy with which detailed hypotheses are investigated; 

• To make up for certain threats to validity that often arise in realistically 
designed experiments. 

Discussion within the experimental community is also needed to address other
issues, such as what constitutes an “acceptable” level of confidence in the hypotheses
that we address as a community. By running carefully designed replications,
we can address threats to validity in specific experiments and accumulate
evidence about hypotheses. However, we are unaware of any useful and specific 
guidelines that concern the amount of evidence that must be accumulated before 
conclusions can confidently be drawn from a set of related experiments, in spite 
of the existence of specific threats. More discussion within the empirical soft- 
ware engineering community as to what constitutes a sufficient body of credible 
knowledge would be of benefit. 
Building up a body of knowledge from families of experiments has the following
benefits for the software engineering researcher:

• It allows the results of several experiments to be combined in order to build 
up our knowledge about software processes. 

• It increases the effectiveness of individual experiments, which can now con- 
tribute to answering more general and abstract hypotheses. 

• It offers a framework for building relevant practical software engineering 
knowledge, organized around the GQM goal template or another framework 
from the literature. 

• It provides a way to develop and integrate laboratory manuals, which can 
facilitate and encourage the types of replications that are necessary to expand 
our knowledge of basic principles. 

• It helps generate a community of experimenters, who understand the value 
of, and can carry out, the needed replications. 

The ability to carry out families of replications has the following benefits for 
the software engineering practitioner: 

• It offers some relevant practical SE knowledge; fully parameterizing process, 
product, and context models allows a better understanding of the environ- 
ment in which the experimental results hold. 

• It provides a better basis for making judgements about selecting processes,
since practitioners can match their development context to the ones under
which the processes are evaluated.

• It shows the importance of and ability to tailor “best practices”, that is, it
shows how software processes can be altered by meaningful manipulation of 
key variables. 

• It provides support for defining and documenting processes, since running 
related experiments assists in determining the important process variables. 

• It allows organizations to integrate their experiences by making explicit the 
ways in which experiences differ (i.e. what the relevant process, product, 
and context models are) or are similar, and allowing the abstraction of basic 
principles from this information. 
Abstract. In this paper we study some natural problems related to
specifying sets of words and trees by patterns. 



1 Introduction 

Patterns are probably the simplest and most natural way to specify non-trivial
families of combinatorial structures. Abstractly, let G be a class of combinatorial
structures with a substructure relation (such as graphs, trees, strings, etc.).
Usually, given G we can define in a natural way a notion of pattern, interpreted
as an under-specified structure of G, that is, a structure with some “unspecified
parts”. A pattern defines a set of instances, which are structures obtained by
instantiating the pattern’s unspecified parts by other structures. For example,
in the case of graph structures, patterns could be defined as graphs with some
“meta-nodes” which can be instantiated by other graphs.

Using these informal definitions, we now introduce the central notions of this
paper. For a set S of patterns, we denote by Inst(S) the set of structures which
are instances of patterns of S. By Cont(S) we denote the set of structures which
have a substructure in Inst(S). In the above example of graphs, if S is a set of
patterns (graphs with “meta-nodes”), Inst(S) is the set of instances of patterns
from S and Cont(S) could be defined as the set of graphs having a subgraph
that is an instance of a pattern of S. We will also study the complements of the
sets Inst(S) and Cont(S), defined by $\overline{Inst}(S) = G \setminus Inst(S)$ and
$\overline{Cont}(S) = G \setminus Cont(S)$.

In this paper we consider two structures, which are probably the most widely 
used data structures in computer science: words and trees. We will define the 
notion of pattern for each of these structures and we will compare the complexity 
of different natural problems related to patterns in the cases of words and trees. 
In this perspective, we survey various known results and give several new ones. 

2 Words, Trees, and Patterns 

Let us start with basic definitions. Given a finite alphabet A of letters, words
over A are defined in the usual way as finite sequences of letters. A* stands
for the set of words over A. From an algebraic point of view, words over A are
elements of the free monoid generated by A. Word patterns over A are defined
as words over the alphabet A ∪ X, where X is an infinite alphabet of variables. For
example, v = abaababaabaab is a word over the alphabet {a, b} and, assuming
that x, y ∈ X, abaxbayb, xabax, xaxxaxax are patterns over {a, b}. A variable
occurring more than once in the pattern is called non-linear, otherwise it is linear.
A subword of a word is a fragment of its letter sequence. For example, baab is a
subword of v and abba is not. A substitution is a morphism σ : (A ∪ X)* → A*
such that σ(a) = a for all a ∈ A. A substitution is non-erasing if σ(x) ≠ ε for
every variable x, where ε is the empty word, and erasing otherwise. A word w ∈ A* is an instance of a
pattern p ∈ (A ∪ X)* if w = σ(p) for some substitution σ. In this case we also say
that p matches w. A substitution can be simply seen as a mapping replacing
variable occurrences in the pattern by words such that the occurrences of the
same variable are replaced by the same word. For example, the word v is an
instance of each of the three patterns above.
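
The definition of an instance can be made concrete with a small sketch. The following Python function is only an illustration and is not taken from the paper: the name match_word, the representation of patterns as strings, and the explicit set of variable names are all assumptions made here. It searches for a substitution by plain backtracking, which may take exponential time in the worst case.

```python
def match_word(pattern, word, variables, erasing=False):
    """Search for a substitution sigma with sigma(pattern) == word.

    `pattern` and `word` are strings; symbols of `pattern` listed in
    `variables` are variables, all other symbols are letters.
    Returns a dict mapping variables to words, or None if none exists.
    (Hypothetical helper for illustration; plain backtracking.)
    """
    min_len = 0 if erasing else 1  # non-erasing: images must be non-empty

    def solve(i, j, subst):
        # i: current position in the pattern, j: current position in the word
        if i == len(pattern):
            return dict(subst) if j == len(word) else None
        symbol = pattern[i]
        if symbol not in variables:
            # A letter must be matched by the same letter of the word.
            if j < len(word) and word[j] == symbol:
                return solve(i + 1, j + 1, subst)
            return None
        if symbol in subst:
            # Repeated occurrence of a variable: reuse its image.
            image = subst[symbol]
            if word.startswith(image, j):
                return solve(i + 1, j + len(image), subst)
            return None
        # First occurrence of a variable: try every admissible image.
        for end in range(j + min_len, len(word) + 1):
            subst[symbol] = word[j:end]
            result = solve(i + 1, end, subst)
            if result is not None:
                return result
            del subst[symbol]
        return None

    return solve(0, 0, {})


v = "abaababaabaab"
print(match_word("abaxbayb", v, {"x", "y"}))  # {'x': 'a', 'y': 'baabaa'}
print(match_word("xabax", v, {"x"}))          # {'x': 'abaab'}
print(match_word("xaxxaxax", v, {"x"}))       # {'x': 'ab'}
```

The three calls reproduce the example above: v is an instance of each of the three patterns, under non-erasing substitutions.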

A tree is a well-formed expression over a signature Σ of function symbols,
where each symbol is indexed by an integer number, called its arity. For example,
u = f(f(f(a, a), h(a)), h(a)) is a tree over the signature Σ = {f, h, a}, where
symbols f, h, a have arity 2, 1, 0 respectively. The set of trees over Σ is denoted
by T(Σ). From an algebraic point of view, T(Σ) is a free Σ-algebra generated by Σ.
Thus, we are dealing with node-labeled trees representing first-order terms over
a given signature. We will use the words tree and term interchangeably. Clearly,
we assume that the signature contains at least one 0-arity (constant) symbol,
otherwise the set of terms is empty. A tree pattern is a tree over Σ ∪ X, where X is
an infinite set of 0-arity variable symbols. Thus, f(x, h(y)) and f(f(f(y, a), x), x)
are tree patterns over {f, h, a}, where x, y are variables. A subtree of a tree t is a
subexpression of t. In other words, a subtree of t is a tree occurring at some node
of t. The subtrees of u are f(f(f(a, a), h(a)), h(a)), f(f(a, a), h(a)), f(a, a), h(a)
and a. Note that h(a) and a have several occurrences in u. A substitution is a
homomorphism σ : T(Σ ∪ X) → T(Σ) such that σ(a) = a for each constant a
from Σ. Again, if t = σ(p) for some term t, pattern p and substitution σ, then
t is said to be an instance of p, and p is said to match t. As for words, a
substitution replaces variables in patterns by trees such that the same variable is
replaced by the same term. For example, the term u is an instance of both patterns
f(x, h(y)) and f(f(f(y, a), x), x), but is not an instance of f(f(x, x), h(a)).

Note that words can be represented as trees in at least two ways. One way is 
to map each letter to a distinct unary symbol, and to add to the signature one 
constant symbol. Then a word can be naturally represented by a non-branching 
tree. However, to represent a pattern consistently, we need to introduce variables 
at internal nodes (second-order variables), which does not fit our framework.
Another way is to map each letter to a corresponding constant symbol and use 
one additional binary symbol for concatenation. In this case, however, one word 
is represented by several trees, due to the associativity property of concatenation. 
In general, words can be seen as trees over one associative function symbol. We 
will see in this paper that this associativity property makes many problems on 
words much more difficult than their counterparts for trees. 
3 Problems

We now state the problems we will address in this paper. We assume that we are
given a set S of word (resp. tree) patterns. As defined in the Introduction, Inst(S)
denotes the set of word (resp. tree) instances of patterns of S, and Cont(S)
denotes the set of words (resp. trees) having a subword (resp. subtree) that is
an instance of a pattern from S. If S consists of a single pattern p, we will write
Inst(p) and Cont(p) as a shorthand for Inst({p}) and Cont({p}).

We are interested in the following problems for both words and trees. Below,
u is a word (resp. tree), p is a word (resp. tree) pattern, and S is a set of patterns.

P1.1 u ∈ Inst(p)?

P1.2 u ∈ Cont(p)?

P2.1 is $\overline{Inst}(S)$ a finite set?

P2.2 is Inst(S) a regular set?

P3 Inst(p) ⊆ Inst(S)?

P4 Inst(p) ⊆ Cont(S)?

P5.1 is $\overline{Cont}(S)$ a finite set?

P5.2 is Cont(S) a regular set?

These questions are standard language-theoretic problems. P1.1 and P1.2 are
membership problems for Inst- and Cont-languages. Since Inst(S) and Cont(S)
are generally infinite, it makes sense to ask whether these sets are co-finite. This
justifies problems P2.1 and P5.1. Problems P2.2 and P5.2 ask whether Inst(S)
(respectively Cont(S)) is a regular set of words (trees). While the notion of a regular
word set (language) is well known, the notion of a regular tree language is probably
less standard. Readers who are not familiar with regular tree languages
are referred to the books [GS84, NP92]. Finally, problems P3 and P4 are also usual
language inclusion questions, since Inst(S) = ∪_{p ∈ S} Inst(p), and Inst(S) ⊆ L iff
Inst(p) ⊆ L for all p ∈ S.

4 The Tree Case 

We now start with the tree case and survey what is known here about the
questions above. This will motivate our study and will allow us to compare these
results with their counterparts for the word case.

P1.1 is a trivial problem for the tree case. It asks whether a term is an
instance of a tree pattern, which can easily be decided in linear time. It is sufficient
to check that the pattern coincides with the term at all non-variable positions,
and that the subterms of the term corresponding to distinct occurrences
of the same variable in the pattern coincide. Clearly, this can be done in time
O(|u| + |p|).
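
The check described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' algorithm: the encoding of terms as nested tuples (symbol, child_1, ..., child_k), the encoding of variables as plain strings, and the name match_tree are assumptions introduced here.

```python
def match_tree(pattern, term, subst=None):
    """Return a substitution sigma with sigma(pattern) == term, or None.

    Terms are nested tuples such as ("f", ("a",), ("a",)); variables in
    patterns are plain strings such as "x". (Hypothetical encoding.)
    """
    if subst is None:
        subst = {}
    if isinstance(pattern, str):
        # Variable position: the first occurrence binds the variable, every
        # later occurrence must correspond to an equal subterm.
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    # Non-variable position: symbol and arity must coincide.
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_child, t_child in zip(pattern[1:], term[1:]):
        if match_tree(p_child, t_child, subst) is None:
            return None
    return subst


A = ("a",)
u = ("f", ("f", ("f", A, A), ("h", A)), ("h", A))   # f(f(f(a,a),h(a)),h(a))

print(match_tree(("f", "x", ("h", "y")), u) is not None)                  # True
print(match_tree(("f", ("f", ("f", "y", A), "x"), "x"), u) is not None)   # True
print(match_tree(("f", ("f", "x", "x"), ("h", A)), u))                    # None
```

The three calls correspond to the example of Section 2: u is an instance of f(x, h(y)) and of f(f(f(y, a), x), x), but not of f(f(x, x), h(a)).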

P1.2 is the subterm matching problem, which has numerous applications in
functional and logic programming, automated deduction, term rewriting and
other areas related to symbolic computation. The problem consists of testing
whether a given pattern occurs in a given tree, that is, matches one of its subtrees.
Usually, one also wants an algorithm to find all such subtrees, and not only to test
if there is one. The restricted version of this problem, when the pattern contains
only linear variables, is known under the name tree matching. In the early 80’s, a
simple practical solution was proposed [HO82]. More recently, a series of
works has sought the most efficient (in the worst case) algorithm for
tree matching. We refer to the latest achievement [CHI99], which proposes an
O(n log^3 n) deterministic algorithm, where n is the size of the tree (assumed to
be bigger than the size of the pattern). The algorithm (like the previously
proposed theoretically efficient algorithms) is however rather complicated and
difficult to implement, and the problem of designing an efficient and practical
tree matching algorithm is still on the agenda. Now, if a pattern contains non-linear
variables, we can preprocess the subject tree by indexing its nodes in such
a way that if the subtrees rooted in two nodes are the same, then these nodes
have the same index. This preprocessing can be done in linear time (under the
assumption that the signature has a constant size) by a bottom-up traversal
of the tree. Then we can “forget” about repeated variables in the pattern and
consider all variable nodes to be labeled by distinct variables. We then run a
tree matching algorithm for linear patterns, and check, each time we find an
occurrence of the linear pattern, whether the subterms corresponding to occurrences of
the same variable in the original pattern are equal (by looking at their indexes).
This comparison takes time proportional to the maximal number of occurrences
in the original pattern (O(|p|) in the worst case), which introduces a |p| factor
with respect to the theoretical complexity of linear pattern matching. We refer to
[RR92] for a detailed algorithm of subterm matching in the presence of non-linear
variables.
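
The indexing step of this preprocessing can be sketched as follows. The code is an illustration under assumptions made here (terms as nested tuples, a Python dictionary as the index table, the name index_subtrees); it is not the algorithm of [RR92], and with a hash table the running time is linear only in the expected sense.

```python
def index_subtrees(term, table):
    """Annotate `term` bottom-up so that equal subtrees get equal indexes.

    A term is a tuple (symbol, child_1, ..., child_k). The result is a tuple
    (index, symbol, annotated_child_1, ..., annotated_child_k).
    `table` maps (symbol, child indexes...) to the index of that subtree.
    """
    children = tuple(index_subtrees(child, table) for child in term[1:])
    # Two subtrees are equal iff they have the same root symbol and the same
    # sequence of child indexes, so this key identifies the subtree.
    key = (term[0],) + tuple(child[0] for child in children)
    index = table.setdefault(key, len(table))
    return (index, term[0]) + children


A = ("a",)
u = ("f", ("f", ("f", A, A), ("h", A)), ("h", A))   # f(f(f(a,a),h(a)),h(a))
table = {}
annotated = index_subtrees(u, table)

left_h = annotated[2][3]    # the h(a) inside the first argument of the root
right_h = annotated[3]      # the h(a) that is the second argument of the root
print(left_h[0] == right_h[0])   # True: both occurrences share one index
print(len(table))                # 5 distinct subtrees of u
```

Once every node carries such an index, testing whether two subterms are equal reduces to comparing two integers, which is what the check for repeated variables relies on.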

Let us now turn to problem P2.1, and consider a generalization of it. Instead
of asking whether $\overline{Inst}(S)$ is finite, we ask whether $\overline{Inst}(S)$ can itself be represented as
Inst(S') for some finite set of patterns S'. Such a set S' is called a complement
representation of S [KP98]. Again, non-linear variables in patterns of S play
an important role. Consider the set S = {h(x), f(h(x), y)} over the signature
{f, h, a} as above. Then the set S' = {a, f(a, x), f(f(x, y), z)} is a complement
representation of S. One can generalize this and prove that if all patterns in the
set are linear, a finite complement representation of this set can be constructed.
However, one can prove that the set S = {f(x, x)} does not have a finite complement
representation. An exhaustive analysis of the situation has been given
in [LM87]. The main result can be stated as follows.

Theorem 1 ([LM87]). A set of patterns S has a finite complement representation
iff there exists a set of linear patterns S_lin such that Inst(S) = Inst(S_lin).
Moreover,

– if such a set S_lin exists, it can be obtained by instantiating the non-linear
variables in the patterns of S by terms,

– the property of having a finite complement representation is decidable.

Let us illustrate Theorem 1 by an example. Consider the set S =
{a, f(x, h(y)), f(x, x), f(x, f(y, z))}, still over the signature {f, h, a}.
This set contains a non-linear term f(x, x). However, a simple analysis shows
that f(x, x) can be replaced by f(a, a) without changing the set of instances.
Thus, Inst(S) = Inst(S_lin), where S_lin is the set of linear patterns obtained
from S by substituting a for x in the term f(x, x). Furthermore, as S_lin
contains only linear patterns, a complement representation of S_lin can be
constructed: S'_lin = {h(x), f(h(x), a), f(f(x, y), a)}. Theorem 1 asserts that
this example is typical: if a finite complement exists, the set is
"linearizable", that is, non-linear variables can be replaced by terms without
changing the set of instances. The decidability of this property, stated in
Theorem 1, means that a bound on the size of terms replacing non-linear
variables can be effectively computed.
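The example can be checked mechanically on a finite fragment of the term universe. The following Python sketch is a sanity check and not a proof: it enumerates all ground terms over {f, h, a} up to an assumed depth bound of 3 and tests both claims on them. The term encoding (nested tuples, variables as strings) and all function names are our own assumptions.

```python
from itertools import product

A = ("a",)


def match(pattern, term, subst):
    """Extend `subst` so that the pattern instantiates to `term`; None on failure."""
    if isinstance(pattern, str):                      # variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        if match(p_arg, t_arg, subst) is None:
            return None
    return subst


def is_instance(term, patterns):
    return any(match(p, term, {}) is not None for p in patterns)


def ground_terms(depth):
    """All ground terms over {f/2, h/1, a/0} of depth at most `depth`."""
    terms = {A}
    for _ in range(depth):
        terms = ({A}
                 | {("h", t) for t in terms}
                 | {("f", s, t) for s, t in product(terms, repeat=2)})
    return terms


S = [A, ("f", "x", ("h", "y")), ("f", "x", "x"), ("f", "x", ("f", "y", "z"))]
S_lin = [A, ("f", "x", ("h", "y")), ("f", A, A), ("f", "x", ("f", "y", "z"))]
S_lin_compl = [("h", "x"), ("f", ("h", "x"), A), ("f", ("f", "x", "y"), A)]

universe = ground_terms(3)
same_instances = all(is_instance(t, S) == is_instance(t, S_lin) for t in universe)
is_complement = all(is_instance(t, S_lin) != is_instance(t, S_lin_compl)
                    for t in universe)
print(len(universe), same_instances, is_complement)
```

On this fragment every term is an instance of exactly one of S_lin and S'_lin, which is what a complement representation requires.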

Recently, the study of finite complement representations has received a new 
impulse [GP99, Pic99], motivated by its applications in different areas, and in 
particular in logic programming. In [Pic99], it has been proved that testing if 
a given set has a finite complement representation (see Theorem 1) is co-NP- 
complete. 

Coming back to problem P2.1, to check if the complement of Inst(S) is finite,
we first check, according to Theorem 1, if S has a finite complement
representation. If the answer is positive, we compute such a representation.
If all patterns in the representation are ground terms (i.e. do not contain
variables), then the complement of Inst(S) is finite. Otherwise, if at least
one pattern has a variable, the complement is infinite. This shows that P2.1
is in co-NP. The hardness of P2.1 follows from [KNRZ91], where it was proved
that deciding if the complement of Inst(S) is empty is co-NP-complete. An easy
modification of the hardness part of this proof shows that P2.1 is co-NP-hard,
and therefore co-NP-complete.

Theorem 1 actually gives an answer to problem P2.2 too. It is an easy exercise
to prove that if a set S contains only linear patterns, Inst(S) is a regular
tree language [GS84, NP92]. Thus, when a set is “linearizable” in the sense of 
Theorem 1, the set of instances is regular. On the other hand, if a set is not 
linearizable, it can be shown using a pumping lemma argument that the set of 
instances is not regular. This is however not easy to prove, but follows from the 
work [Kuc91] that we will survey below. We summarize the discussion in the 
following statement. 

Proposition 1. In the tree case, P2.1 and P2.2 are co-NP-complete problems.

Now let us skip problem P3 for a moment and turn to problem P4, which now has
a history of more than ten years. The problem, known under the name of
ground reducibility problem, has attracted a lot of attention in the area of term 
rewriting [DJ90] because of its application to automated inductive proofs [JK89]. 
The problem consists of testing if all instances of a given tree pattern p have a 
subtree matched by one of the patterns of a given set S. Once again, non-linear 
variables in patterns of S make the problem much more difficult. In the middle 
and late 80’s, several authors observed that the problem is decidable if patterns 
of S only contain linear variables. The problem was first proved decidable in 
the general case by Plaisted [Pla85], and later by other authors independently
[KNZ87, Com88]. Recently, the problem was shown to be EXPTIME-complete 
[CJ97]. 

Problem P3 can be expressed in terms of P4 in the following way. Assume
we have a pattern p and a set of patterns S, and we want to test whether
Inst(p) ⊆ Inst(S). First delete from S those patterns which do not have the
same root symbol as the root symbol of p (obviously, these patterns cover no
instance of p). Then choose a new symbol a and replace the root symbol in p and
in all remaining patterns in S by a. Let p' and S' be the resulting pattern and
set respectively. It can be shown that Inst(p) ⊆ Inst(S) iff Inst(p') ⊆ Cont(S').
The latter property, which is a special instance of ground reducibility, can be
expressed as the so-called sufficient completeness property for specifications
with free constructors (see [KNRZ91]). Deciding this property has been proved
co-NP-complete in [KNRZ91].

Proposition 2. In the tree case, P3 and P4 are both decidable problems. P3 is
co-NP-complete and P4 is EXPTIME-complete.

Finally, let us turn to problems 5.1 and 5.2. Problem 5.1 has been proved
decidable in [Pla85, KNZ87]. Concerning Problem 5.2, the following Theorem 
has been proved in [Kuc91]. 

Theorem 2. For a set of patterns S, Cont(S) is a regular tree language iff there
exists a set of linear patterns S_lin such that Cont(S) = Cont(S_lin). Moreover,

— if such a set S_lin exists, it can be obtained by instantiating the non-linear
variables in the patterns of S by terms.

Theorem 2 is a lifting of Theorem 1 from the set of instances Inst(S) to
the set Cont(S) of terms containing instances of S as subterms. The latter
case is however much more difficult, and the proof of Theorem 2 used a
non-constructive combinatorial argument based on Ramsey's theorem. Therefore,
no effective bound on the size of terms to be substituted for the non-linear
variables resulted from the proof, and the decidability of the regularity of
Cont(S) remained an open problem. This problem, considered important in the
area of rewriting, appeared in the list of major open problems in rewriting in
[DJK91]. Soon after, the regularity of Cont(S) was proved decidable by
three groups of authors [KT92, VG92, HH92]. The results of [KT95] also provided
a new proof of the decidability of problem 5.1, and even gave an effective
bound on the size of the complement of Cont(S) in the case it is finite. We
conclude this section with the following

Proposition 3. In the tree case, P5.1 and P5.2 are both decidable problems. 

5 The Word Case 

The overview of the tree case given in the previous section shows that all the 
problems are decidable, though the complexity of some of them appears to be 
high. In this section we study these problems in the word case and see that most
of them, and even some restricted versions of them, turn out to be undecidable. 
We also analyze the complexity of these problems in the case of linear patterns. 

We first remark that in the tree case, Cont(S) is a "meta-notion" with respect
to Inst(S), due to the fact that the notion of subtree cannot be expressed
by means of patterns, as only first-order variables are allowed in patterns. In
contrast, in the word case Cont(S) can be expressed in terms of Inst(S):

    Cont(S) = Inst({xpy, xp, py, p | p ∈ S and x, y do not occur in p})

This implies that, in contrast to the tree case, the problem for Cont(S) is
simpler than its counterpart for Inst(S). In particular, if a problem is decidable
for Inst(S), it is also decidable for Cont(S). On the other hand, if a problem is
undecidable for Inst(S), the undecidability of its counterpart for Cont(S) may
be harder to prove. We will face this situation later in this section.
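The identity between Cont and Inst in the word case can be tested by brute force on short words. The sketch below assumes the non-erasing reading (variables stand for non-empty words) and an encoding of our own: a pattern is a string in which upper-case characters are variables and all other characters are letters. The names `matches`, `in_cont` and `in_cont_via_inst` are illustrative only.

```python
import string
from itertools import product


def matches(pattern, word, subst=None):
    """True iff `word` is a non-erasing instance of `pattern`."""
    subst = {} if subst is None else subst
    if not pattern:
        return not word
    head, rest = pattern[0], pattern[1:]
    if not head.isupper():                       # a letter must match literally
        return word[:1] == head and matches(rest, word[1:], subst)
    if head in subst:                            # variable already bound
        value = subst[head]
        return word.startswith(value) and matches(rest, word[len(value):], subst)
    return any(matches(rest, word[i:], {**subst, head: word[:i]})
               for i in range(1, len(word) + 1))  # try every non-empty prefix


def in_cont(pattern, word):
    """Direct definition: some factor of `word` is an instance of `pattern`."""
    n = len(word)
    return any(matches(pattern, word[i:j])
               for i in range(n) for j in range(i + 1, n + 1))


def in_cont_via_inst(pattern, word):
    """Same test, expressed through Inst({xpy, xp, py, p}) with fresh x and y."""
    x, y = [v for v in string.ascii_uppercase if v not in pattern][:2]
    return any(matches(q, word)
               for q in (x + pattern + y, x + pattern, pattern + y, pattern))


words = ["".join(w) for n in range(1, 7) for w in product("ab", repeat=n)]
agree = all(in_cont(p, w) == in_cont_via_inst(p, w)
            for p in ("XaX", "aXb", "XX") for w in words)
print(agree)
```

Four patterns are needed per original pattern precisely because, in the non-erasing case, the surrounding variables x and y cannot be mapped to the empty word.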

Note another difference with the tree case: in contrast to trees, we may allow
variables in word patterns to be substituted by the empty word. This gives rise to
two cases depending on whether this possibility is allowed or not. Following Kari
et al. [KMPS95], we call these cases erasing (E-case for short), if substituting by
the empty word is allowed, and non-erasing (NE-case), if it is not allowed. We
will generally speak about the NE-case, unless the E-case is explicitly mentioned.

An early result of Angluin [Ang80] asserts that problem P1.1 is NP-complete.
This implies that P1.2 is also NP-complete, as w ∈ Inst(p) iff #w# ∈
Cont(#p#), where # is a fresh letter. This NP-completeness result immediately
shows that the word case is much more difficult, as P1.1 and P1.2 are
polynomial problems in the tree case, of low polynomial degree. However, if
pattern p is linear, P1.1 and P1.2 can be solved in linear time, as they actually
reduce to the well-known string matching problem, and can be solved, e.g., by
the Knuth-Morris-Pratt algorithm [CR95]. In the general case, the naive
algorithm solving P1.1 is in O(|w|^k) (respectively O(|w|^(k+2)) for P1.2),
where k is the number of distinct variables in p. Neraud [Ner95] showed how
this complexity can be slightly reduced (roughly, the exponent can be decreased
by 2) and obtained some specialized efficient algorithms for P1.2 for the cases
of low k (1 or 2).

Proposition 4. In the word case, problems P1.1 and P1.2 are NP-complete.
Both problems can be solved in linear time if pattern p is linear.
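For the linear case, the reduction to string matching can be illustrated as follows. The sketch gives a standard Knuth-Morris-Pratt search and uses it to decide P1.2 for a linear pattern of the restricted shape u1 x1 u2 ... x(n-1) un in the non-erasing case, by locating the constant blocks greedily from left to right. The function names and the restriction to this shape are assumptions made for brevity, not the formulation of [CR95].

```python
def failure(u):
    """KMP failure table: fail[i] = length of the longest proper border of u[:i+1]."""
    fail = [0] * len(u)
    k = 0
    for i in range(1, len(u)):
        while k and u[i] != u[k]:
            k = fail[k - 1]
        if u[i] == u[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(u, w, start=0):
    """Leftmost index >= start at which the non-empty word u occurs in w, or -1."""
    fail = failure(u)
    k = 0
    for i in range(start, len(w)):
        while k and w[i] != u[k]:
            k = fail[k - 1]
        if w[i] == u[k]:
            k += 1
        if k == len(u):
            return i - len(u) + 1
    return -1


def occurs_in_order(blocks, word):
    """Decide word in Cont(u1 x1 u2 ... un): the blocks occur in order,
    separated by at least one letter each (non-erasing variables)."""
    pos = 0
    for n, block in enumerate(blocks):
        idx = kmp_find(block, word, pos if n == 0 else pos + 1)
        if idx < 0:
            return False
        pos = idx + len(block)
    return True


print(kmp_find("abab", "aabababb"))            # 1
print(occurs_in_order(["ab", "ba"], "abba"))   # False: no letter left for x1
print(occurs_in_order(["ab", "ba"], "ababa"))  # True
```

Taking the leftmost occurrence of each block is safe here because the variables are linear: no later constraint can depend on which word a variable was mapped to.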

The difficulty of the matching problem P1.1 in the case of words can also be
illustrated by the fact that if a word w is matched by a pattern p, that is
w = σ(p), then the substitution σ does not have to be unique. For example,
pattern xy can match a word w in (|w| - 1) different ways, corresponding to the
factorizations of w into two parts. It is easy to see that many patterns admit
this situation (e.g. all linear patterns), but not all of them: for example,
patterns x, xx (and more generally, one-variable patterns) have a unique way to
match a word. Formally, a pattern p is called non-ambiguous if there is a unique
way for p to match each word of Inst(p), and ambiguous otherwise. The ambiguity
of patterns was studied by Mateescu and Salomaa [MS94]. They introduced the
notion of degree of ambiguity of a pattern p, defined as the maximal number of
ways for p to match a word from Inst(p) provided this number is finite; otherwise
the degree of ambiguity is ∞. It is easy to exhibit patterns with degree of
ambiguity 1 or ∞, and much more difficult with a finite degree of ambiguity
different from 1. In [MS94], it was shown that pattern p = xabxbcayabcy has
degree of ambiguity 2. For example, there are two ways for p to match the word
caabcabcaabcbcabcabcbc, and any word from Inst(p) is matched by p in at most
two ways. The authors also found a pattern of degree of ambiguity 3 and, by
some composition technique, patterns of any degree 2^m 3^n. However, they state
it as an open question whether every finite degree of ambiguity is realizable
by some pattern. The decidability status of determining if the degree of
ambiguity of a pattern is finite is also open.
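The number of ways a pattern matches a given word can be counted by exhaustive search. The following Python sketch assumes the non-erasing case and writes variables as upper-case characters (so xabxbcayabcy becomes "XabXbcaYabcY"). The function name `count_matches` is ours, and the search is exponential in general, in line with the NP-completeness of P1.1.

```python
def count_matches(pattern, word, subst=None):
    """Number of non-erasing substitutions sigma with sigma(pattern) == word."""
    subst = {} if subst is None else subst
    if not pattern:
        return 1 if not word else 0
    head, rest = pattern[0], pattern[1:]
    if not head.isupper():                        # letter
        return count_matches(rest, word[1:], subst) if word[:1] == head else 0
    if head in subst:                             # bound variable
        value = subst[head]
        if not word.startswith(value):
            return 0
        return count_matches(rest, word[len(value):], subst)
    return sum(count_matches(rest, word[i:], {**subst, head: word[:i]})
               for i in range(1, len(word) + 1))  # every non-empty prefix


print(count_matches("XY", "abcde"))                                   # 4 = |w| - 1
print(count_matches("XX", "abab"))                                    # 1
print(count_matches("XabXbcaYabcY", "caabcabcaabcbcabcabcbc"))        # 2
```

For the last call the two substitutions are x = ca, y = abcbc and x = caabc, y = bc.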

Let us now turn to problem P3. A striking result has been proved in [JSSY93]:
inclusion Inst(p) ⊆ Inst(S) is undecidable even if S consists of a single pattern.
This contrasts with the fact that the equivalence problem Inst(p1) = Inst(p2) is
trivial: the equivalence holds iff p1 and p2 are equal modulo a variable renaming.
The latter is however true only in the NE-case, and for the E-case the decidability
status of the equivalence problem Inst(p1) = Inst(p2) is open. We also point
to paper [Fil88] for some results about the inclusion problem Inst(p1) ⊆ Inst(p2)
in the E- and NE-case.

Proposition 5. In the word case, problem P3 is undecidable even if S consists
of a single pattern. 

Formally, the undecidability result of [JSSY93] for problem P3 does not imply 
the undecidability of problem P4 (see the discussion in the beginning of this 
section). Problem P4 has been studied in [KR95b], where it has been proved 
undecidable. 

Proposition 6. In the word case, problem P4 is undecidable.

An interesting feature of the proof of [KR95b] is that it implies that the 
problem Inst(p) ⊆ Cont(S) remains undecidable if p has a very simple form,
namely the form axa, where a is a letter and x a variable. It seems very difficult 
(if at all possible) to further simplify p. We will come back to this issue below. 

Based on the proof of the result of [KR95b], we now establish a new result. 

Theorem 3. In the word case, problem P2.1 is undecidable. 

Proof. We give a very general idea of the proof. To reconstruct the details, the 
reader is referred to [KR95b]. 

First, we review the proof of Proposition 6 from [KR95b]. To show that
Inst(axa) ⊆ Cont(S) is undecidable, the construction of S is based on the
following idea. The instances of p = axa are assumed to encode runs of a given
deterministic Minsky (two-register) machine M on given input data d. Patterns of
S are designed in such a way that every instance of p which does not encode a
correct run of machine M on data d contains some pattern from S. To put it
another way, an instance of p which does not contain any pattern of S must
encode a correct finite run of machine M on data d. Therefore, there exists an
instance of p which does not contain an instance of S iff M halts on d, which is
an undecidable property.

To prove Theorem 3, we modify the proof as follows. We modify the set of
patterns S in such a way that it encodes only a Minsky machine M, and does
not specify any input data d. Let S' be the modified set of patterns.
Consider now the set of patterns

    S = {αx | α ∈ A, α ≠ a} ∪ {xα | α ∈ A, α ≠ a} ∪
        {xpy | p ∈ S' and x, y do not occur in p},                    (1)

where a is the same letter as in the pattern p above. From the previous discussion,
it is clear that the words in the complement of Inst(S) are words of the form
awa which are not instances of S'. By construction of S', these are words which
encode a correct finite run of the machine M on some input data. Since it is
undecidable if a machine stops on a finite number of input data, it is
undecidable if the complement of Inst(S) is finite or not.

The decidability status of Problem P2.2 is open [KMPS95]. The inverse problem,
whether a given regular language is expressible as Inst(S), is also not known
to be decidable. It is also open if it is decidable for a language Inst(S) to be
context-free. However, it was proved in [KMPS95] that it is undecidable if a
given context-free language is expressible as Inst(S).

Let us now consider problem P5.1. The proof of Theorem 3 above may suggest 
that P5.1 is not so much different from P2.1 and must be also undecidable by a 
similar proof. Indeed, all “important” patterns occur in the third set of (1), and 
patterns in the first and the second sets are extremely simple - they consist
of a single letter followed or preceded by a variable. However, these “extremely 
simple” patterns play a crucial role as they actually specify the first and last 
letter in the words of the language, which is necessary for an undecidability proof 
(see [KR95b]). 

The decidability status of Problem 5.1 is open. Actually, it is the most general 
version of the famous avoidability problem. The avoidability problem was studied 
in word combinatorics under a very restricted form - when S contains a single
pattern p, and moreover, p contains only variables and no letters. However, even 
in this restricted form the problem turns out to be extremely difficult. 

It is not known if testing the finiteness of the complement of Cont(p) is
decidable or not. The author of [Cur93] offered 100 US dollars (2278.78 Russian
rubles as of February 12, 1999) for a solution of this problem.

A pattern p is called unavoidable (blocking according to the terminology of
[Zim84]) if the complement of Cont(p) is finite, and avoidable otherwise.
Clearly, p is avoidable iff there exists an infinite word which does not contain
(finite) subwords which are instances of p.

Interestingly, the study of avoidability is historically at the origin of word
combinatorics and formal language theory. At the beginning of the century,
Axel Thue obtained his famous construction of an infinite square-free word on
the three-letter alphabet and an infinite cube-free word on the two-letter
alphabet. In the terminology of pattern avoidance, a square-free (respectively
cube-free) word is a word which does not contain the pattern xx (respectively
xxx). Trivially, xx is unavoidable on two letters and xxx is unavoidable on one
letter. A pattern which is avoidable on four letters but not on three letters
has been described in [BEM79]. No pattern is known which is avoidable on k
letters but unavoidable on k - 1 letters for k > 4.
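The small cases mentioned above are easy to confirm by exhaustive search. The Python sketch below (names `has_power` and `avoiding_words` are ours) lists the words of a given length that avoid xx or xxx. It only inspects finite lengths, so it can confirm unavoidability on small alphabets but merely illustrates, and does not prove, the existence of Thue's infinite words.

```python
from itertools import product


def has_power(word, exponent):
    """True iff `word` has a factor of the form u^exponent with u non-empty,
    i.e. contains an instance of the pattern x...x (exponent copies of x)."""
    n = len(word)
    for length in range(1, n // exponent + 1):
        for start in range(n - exponent * length + 1):
            block = word[start:start + length]
            if word[start:start + exponent * length] == block * exponent:
                return True
    return False


def avoiding_words(alphabet, exponent, length):
    """All words of the given length over `alphabet` avoiding x^exponent."""
    candidates = ("".join(w) for w in product(alphabet, repeat=length))
    return [w for w in candidates if not has_power(w, exponent)]


print(avoiding_words("ab", 2, 3))          # ['aba', 'bab']
print(avoiding_words("ab", 2, 4))          # []: xx is unavoidable on two letters
print(avoiding_words("a", 3, 3))           # []: xxx is unavoidable on one letter
print(len(avoiding_words("abc", 2, 4)))    # 18 square-free ternary words of length 4
```

Since every factor of a square-free word is square-free, the empty list at length 4 already shows that no binary word of length 4 or more avoids xx.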

The above discussion shows that the size of the alphabet may be crucial in 
avoiding patterns. We refer to [Cas94] for a survey of the state-of-the-art in pat- 
tern avoidance. A key result in the area is an algorithm proposed independently 
in [BEM79, Zim84], which decides if there exists an alphabet on which a given 
pattern can be avoided. However, as was mentioned above, it is not known if 
for a fixed alphabet one can decide, given a pattern, if it is avoidable on this 
alphabet. 

The rest of the paper is devoted to analyzing some of our problems in case 
the set S consists of linear patterns. We already mentioned that problems P1.1
and P1.2 can be efficiently solved if p is a linear pattern. For the other problems
we will see that although they become decidable in the linear case, they remain
intractable.

Note that if S consists of linear patterns, the languages Inst(S) and Cont(S)
are regular languages specified by a regular expression of the form

    (A*) w_i1 A* w_i2 ... A* w_ik (A*),                               (2)

where the w_ij are words and parentheses indicate that A* may or may not occur
at the beginning and the end of the expression. Thus, problems P2.2 and P5.2
are always positively answered. Note also that inclusion and equivalence of
regular languages specified by general regular expressions is a PSPACE-complete
problem (cf. [GJ79]).
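The passage from a linear pattern to a regular expression can be illustrated with Python's `re` module. In this sketch a pattern is a string whose upper-case characters are variables. Each variable becomes the character class of the alphabet followed by `+` in the non-erasing case or `*` in the erasing case (the `erasing` flag and the function name are our assumptions). Python's backtracking matcher stands in for the finite automaton, so nothing is claimed about running time.

```python
import re


def linear_pattern_to_regex(pattern, alphabet, erasing=False):
    """Compile a linear word pattern into a regular expression over `alphabet`."""
    seen = set()
    parts = []
    for symbol in pattern:
        if symbol.isupper():                       # a variable
            if symbol in seen:
                raise ValueError("pattern is not linear: repeated " + symbol)
            seen.add(symbol)
            parts.append("[" + re.escape(alphabet) + "]" + ("*" if erasing else "+"))
        else:                                      # a letter
            parts.append(re.escape(symbol))
    return re.compile("".join(parts))


rx = linear_pattern_to_regex("XabYba", "ab")
print(rx.fullmatch("aabbba") is not None)   # True: X = a, Y = b
print(rx.fullmatch("abba") is not None)     # False: X and Y must be non-empty
print(rx.search("bbaabbbab") is not None)   # membership in Cont: search for a factor
```

`fullmatch` corresponds to membership in Inst(p), while `search` corresponds to membership in Cont(p), which is where the optional A* at both ends of (2) comes from.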

In [KR95a] problems P4 and P5.1 have been studied under the condition that 
the patterns of S are linear. As for P4, it has been proved that it is decidable 
in this case, regardless of whether p is linear or not. If p is restricted to be linear too, the
problem has been proved to be co-NP-complete [KR95a]. The exact complexity 
of the case when patterns of S are linear but pattern p is not, is not known to 
us. However, if the maximal number of occurrences of a variable is bounded, the 
problem remains co-NP-complete. 

Proposition 7. Problem P4 of testing Inst(p) ⊆ Cont(S) is decidable if S
consists of linear patterns. If p is linear in addition, the problem is co-NP-complete.

It was also proved in [KR95a] that if S is restricted to contain linear patterns 
only, problem P5.1 is co-NP-complete too. 

To move on, we need to sketch the co-NP-completeness proofs from paper
[KR95a]. Consider problem P4 for the case that pattern p and all patterns of
S are linear. The co-NP-hardness of this problem is easy to show. We refer to
[KR95a] for the reduction from MONOTONE-ONE-IN-THREE-SAT. However,
proving the membership in co-NP is the non-trivial part. It amounts to
showing that if Inst(p) ⊈ Cont(S), there is an instance of p of size polynomial
in (|S| + |p|) which does not contain any pattern from S. Of course, the language
Cont(S) and its complement are regular, as Cont(S) has form (2). The
proof of [KR95a] consisted of defining a compact deterministic finite automaton
(DFA) for these languages verifying the following key property: although the
total size (number of states) of this automaton is exponential in |S|, the
longest loop-free path from the initial to the final state is of polynomial
length. We refer to [KR95a] for further details.

This property of the automaton made it possible to show that in case Inst(p) ⊈
Cont(S), the minimal size of an instance of p which is not in Cont(S) is
polynomial in |S|. Similarly, if the complement of Cont(S) is finite (problem
P5.1), we can give a polynomial bound on the length of words in this complement.
This provides a key argument in the co-NP-completeness proof.

Here we use this argument to show the co-NP-completeness of two other 
problems - P3 (in case p is a linear pattern) and P2.1. 

Since P3 is a more general problem than P4 in the word case, P3 is co-NP-hard
if p is a linear pattern. Similarly, P2.1 is more general than P5.1 and is then
also co-NP-hard. To prove that both of them are in co-NP, we use an adaptation
of the deterministic automaton construction from [KR95a] from the language
Cont(S) to Inst(S). We skip the details of the construction, which would require
too much space, and summarize the results in the following statement.

Theorem 4. Assuming a linear pattern p and a set of linear patterns S,
problems P2.1, P3, P4 and P5.1 are co-NP-complete.

Finally, for a linear pattern p, following [Shi82], we can build a DFA
recognizing Inst(p) in polynomial (linear) time: if p = (x_0) u_1 x_1 ... x_{n-1} u_n (x_n)
(u_i ∈ A+, x_i ∈ X), the idea is to build DFAs D_1, ..., D_n recognizing
respectively Cont(u_1), ..., Cont(u_n), and then to identify the final state of D_i
with the initial state of D_{i+1}. This construction implies, in particular, that
for the special case of P3 and P4 where p is linear and S consists of a single
linear pattern, a solution can be obtained in polynomial time: the question
Inst(p) ⊆ Inst(p') is equivalent to the emptiness of the intersection of Inst(p)
with the complement of Inst(p'), whose DFA is easily derived in polynomial
(quadratic) time [HU79].

Proposition 8. Assuming a linear pattern p and S = {p'} with p' a linear
pattern, P3 and P4 can be checked in polynomial time.

6 Conclusions 

In this paper we formulated several language-theoretic problems which are mean- 
ingful for any combinatorial structure equipped with a notion of pattern and a 
substructure relation. We then studied the algorithmic complexity of those
problems for two particular structures - trees over a finite signature and words over
a finite alphabet. It turns out that the instances of these problems for words 
and trees cover a large area of research, including seemingly quite unrelated 
subareas. Some problems on trees have been studied in term rewriting theory, 
with relation to the theory of tree languages. Some other problems, such as tree 
matching, have received much attention in the area of algorithm development. 
Applied to words, those problems have been studied in the area of word com- 
binatorics and formal language theory, including the recent research stream on 
pattern languages. Again, the matching problem for words has been subject of 
intensive studies in the algorithmics area. We found it interesting that all these 
problems can be expressed uniformly as classical problems on languages specified 
by patterns. 

We attempted to give a brief survey of the considered problems, putting the
stress on comparing the tree and the word case. Moreover, we gave several new
results for the word case. We showed that all problems are easier in the tree case
than their counterparts in the word case. In particular, except for the matching
problem, all problems are decidable in the tree case and undecidable in the word
case. For the word case, we paid special attention to the linear case, where the
problems become decidable but, as we have shown, remain of high algorithmic
complexity.

Abstract. Monads are a technique widely used in functional programming
languages to address many different problems. This paper presents
extensions, a functional-logic programming technique that constitutes an 
alternative to monads in several situations. Extensions permit the defi- 
nition of easily reusable functions in the same way as monads, but are 
based on simpler concepts taken from logic programming, and hence they 
lead to more appealing and natural definitions of types and functions. 
Moreover, extensions are compatible with interesting features typical of 
logic programming, like multiple modes of use, while monads are not. 



1 Introduction 

Functional-logic programming, FLP for short, aims to integrate functional and
logic programming, allowing the use of techniques from both paradigms in the
same declarative framework (see [5] for a survey). Moreover, the combination
of ideas of the two worlds gives rise to new features specific to FLP. This work 
should be seen as a contribution in this direction, for it presents a new technique, 
the extensions, that can be used as an alternative to the functional technique of 
monads when programming in a functional-logic language. 

The concept of monad comes from category theory, and it has been widely 
used in functional programming to structure functions, pointing out the essence 
of the algorithms represented while concealing the data flow and the associated 
computations [13,14,15]. 

In several FLP frameworks such as Escher [9], Curry [6] or our working lan- 
guage, Toy [3,12], monads can be used directly, yielding the same benefits as 
in the case of functional programming. However, FLP has a wider range of pro- 
gramming mechanisms, including logical variables, and it should be questioned 
whether it is possible to define a specific FLP technique to address the same 
kind of problems from a different point of view. In the rest of the paper we de- 
scribe such an alternative, the FLP extensions. Although lacking the theoretical 
background and wide range of applications of monads, extensions present some 
specific advantages, such as: 
• Extensions can replace monads in several different situations, allowing the
same expressiveness but using much simpler concepts. 

• Multiple modes of use are allowed by extensions, which is not so easy to achieve 
when defining monads in an FLP context. 

• In the case of adding new features to functions, monads enforce the evalua- 
tion of both the old and the new values simultaneously. Conversely, extensions 
can use the new feature only where it is required, thus avoiding unnecessary 
computations. 

2 The FLP Framework: A Succinct Description of TOY

All the programs in the next sections are written in the purely declarative
functional-logic language TOY, which is a concrete realization of CRWL, a
theoretical framework for declarative programming (see [4]). We present here only
the subset of the language relevant to this work. A more complete description 
and a number of representative examples can be found in [3]. 

A TOY program consists of datatype, type alias, infix operator definitions,
and rules for defining functions. Syntax is mostly borrowed from Haskell [7], with
the remarkable exception that variables begin with upper-case letters whereas 
constructor and function symbols use lower-case. 



    infixr 20 :/:

    data expr = val real | expr :/: expr

    eval :: expr -> real
    eval (val A)   = A
    eval (A :/: B) = (eval A) / (eval B)



Fig. 1. Monadic variations of the basic evaluator 



Our first example of a program written in TOY may be seen in figure 1. This
program is the TOY version of the evaluator for simple expressions presented by
P. Wadler in his article [15], and will be our starting point in order to compare
monads and extensions. The evaluator itself is represented by function eval,
which takes an expression E as the only input parameter, and returns the real
number resulting from evaluating E. An expression can be either a real number
r, represented as val r, or a quotient between expressions e1 and e2, represented
as e1 :/: e2.

In general, each function f in TOY is defined by a set of conditional rules
of the form

    f t1 ... tn = e <== e1 == e1', ..., ek == ek'

where (t1 ... tn) forms a tuple of linear (i.e. with no repeated variable)
constructor terms, and e, ei, ei' are expressions. No other conditions (except
well-typedness) are imposed on function definitions. Rules have a conditional
reading: f t1 ... tn can be reduced to e if all the conditions e1 == e1', ...,
ek == ek' are satisfied. The condition part is omitted if k = 0 (as in our
previous example eval). The symbol == stands for strict equality, which is the
suitable notion for equality when non-strict functions are considered. With this
notion a condition e == e' can be read as: e and e' can be reduced to the same
constructor term.

TOY can introduce non-deterministic computations by different means, but
we only need one of them for this discussion, namely the occurrence of extra
variables in the right-hand side of rules, as in

    z_list = [0|L]

Although in this case z_list reduces only to [0|L], the free variable L can be
later on instantiated to any list. Therefore, any list of integers starting
with 0 is a possible value of z_list.

Computing in TOY means solving goals, which take the form

    e1 == e1', ..., ek == ek'

giving as its result a substitution for the variables in the goal making it true. 
Evaluation of expressions (required for solving the conditions) is done by a vari- 
ant of lazy narrowing based on a sophisticated strategy, called demand driven 
strategy which uses the so-called definitional trees [2] to guide unification with 
patterns in left-hand sides of rules (see [8]). For instance, using the evaluator 
defined above we may try the goal: 

    eval (val 16 :/: val 4 :/: val 1 :/: val 8) == R

which yields R == 0.5.
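For readers more familiar with mainstream languages, the basic evaluator and the goal above can be mirrored in Python. This is a rough analogue only: the class and function names (`Val`, `Div`, `eval_expr`, `div_chain`) are our own, and Python evaluates the expression in one direction, whereas TOY solves the goal by lazy narrowing. The helper `div_chain` nests to the right to mimic the `infixr` declaration of `:/:`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Val:
    value: float          # corresponds to `val r`


@dataclass(frozen=True)
class Div:
    left: "Expr"          # corresponds to `e1 :/: e2`
    right: "Expr"


Expr = Union[Val, Div]


def eval_expr(e: Expr) -> float:
    """Counterpart of the two rules of eval in figure 1."""
    if isinstance(e, Val):
        return e.value
    return eval_expr(e.left) / eval_expr(e.right)


def div_chain(*values: float) -> Expr:
    """Build v1 :/: v2 :/: ... :/: vn, nesting to the right like infixr."""
    expr: Expr = Val(values[-1])
    for v in reversed(values[:-1]):
        expr = Div(Val(v), expr)
    return expr


goal = div_chain(16, 4, 1, 8)      # 16 / (4 / (1 / 8))
print(eval_expr(goal))             # 0.5
```

The right-nested reading matters: grouping the same four values to the left would give 16 / 4 / 1 / 8 = 0.5 as well only by coincidence of these particular numbers, so the nesting is made explicit in `div_chain`.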

As an aside, we remark that the current version of our language does not 
incorporate lambda abstractions or let constructions. However, these syntactic 
facilities are usual in the functional programming literature, and we have in- 
cluded them in some of our examples in order to fairly represent the monadic 
approach. For testing the examples in the actual implementation, we have simply 
needed to ‘lift’ such constructions using well-known techniques [10]. 



3 Functional-Logic Monads

In this section we present two variations of the basic evaluator, following the lines 
of Wadler’s paper [15] . We also recall briefly some of the basic concepts concerned 
with monads, which will be useful when comparing monads and extensions. 
However, we will not dwell on this point, assuming that the definition
and usefulness of monads are well known, and refer to the cited article for a
deeper discussion of these issues.

To convert a function f :: A -> B to monadic form we change its type to
f :: A -> m B, meaning that function f accepts a parameter of type A and returns
a value of type B, with an associated computation represented by m. The
structure of the function will be based on the functions unit :: A -> m A (also
known as result) and (*) :: m A -> (A -> m B) -> m B (usually called bind) and
indicates how the value B is constructed, avoiding any explicit reference to the
computation m. Only unit and * (and perhaps some auxiliary functions) will
'know' what m actually is, and how to deal with it. If we want to add some extra
capabilities to the original code of f later, we only need to look for an
appropriate data constructor m' that captures the essence of the modification.
Then we redefine the type of the function to f :: A -> m' B, define the new
versions of * and unit and, perhaps, make a few local changes in the code of the
function itself, but always keeping the same basic structure.

Figure 2 shows two ‘classical’ variations of the original evaluator. 



First variation:

    type state = int
    type m A = state -> (A, state)

    unit :: A -> m A
    unit A X = (A,X)

    infixr 30 *
    (*) :: m A -> (A -> m B) -> m B
    (*) M K S = let (A,S2) = M S
                in K A S2

    tick :: m ()
    tick X = ((), X+1)

    eval :: expr -> m real
    eval (val A)   = unit A
    eval (A :/: B) =
        eval A * λR1. eval B * λR2.
        tick * λ(). unit (R1/R2)

Second variation:

    type output = string
    type m A = (output, A)

    unit :: A -> m A
    unit A = ("", A)

    infixr 30 *
    (*) :: m A -> (A -> m B) -> m B
    (X,A) * K = let (Y,B) = K A
                in (X++Y,B)

    out :: output -> m ()
    out X = (X, ())

    eval :: expr -> m real
    eval (val A)   = out (line (val A) A)
                     * λ(). unit A
    eval (A :/: B) =
        eval A * λR1. eval B * λR2.
        out (line (A :/: B) (R1/R2))
        * λ(). unit (R1/R2)



Fig. 2. Monadic variations of the basic evaluator 

The first variation is based on the very useful state monad, taken from [15]
and adapted to TOY syntax, which is used to count the total number of divisions
performed while evaluating the expression. The second variation produces a trace 
of the evaluation. This last variation uses a function line which produces a step 
of the trace and may be defined as: 

line T R = "eval(" ++ showterm T ++ " ++ 

number _to .string R ++ "\n" 

assuming suitable definitions for showterm and number_to_string. The infix 
operator ++ is the standard function for concatenation of lists. It can be seen 
that the basic structure of eval is kept almost unaffected. If we had modified 
the initial code directly, this would have been more difficult to achieve. 
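
To make the state variation concrete, the following is a minimal Haskell
sketch of the left-hand column of figure 2. It is a transcription under
assumptions, not the TOY program itself: the names Expr, Val, M and bind are
chosen here for illustration, bind stands for the combinator *, and the
fixity infixr 7 is used because Haskell only allows precedences from 0 to 9.

```haskell
-- Expressions of the basic evaluator (constructor names are assumptions).
data Expr = Val Double | Expr :/: Expr
infixr 7 :/:

-- State monad written out by hand, as in figure 2: the state counts divisions.
type State = Int
type M a = State -> (a, State)

unit :: a -> M a
unit a s = (a, s)

-- bind plays the role of the combinator * of the paper.
bind :: M a -> (a -> M b) -> M b
bind m k s = let (a, s2) = m s in k a s2

-- Increment the division counter.
tick :: M ()
tick s = ((), s + 1)

eval :: Expr -> M Double
eval (Val a)   = unit a
eval (a :/: b) =
  eval a `bind` \r1 ->
  eval b `bind` \r2 ->
  tick   `bind` \() ->
  unit (r1 / r2)
```

In this sketch, eval (Val 8 :/: Val 4 :/: Val 2) 0 threads the counter
through both divisions and yields the pair (4.0, 2).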

4 Functional-Logic Extensions 

In the previous section we have sketched how the monadic approach can be
adopted in TOY. Now it is time to present the alternative provided by our FLP
extensions.



4.1 An Informal Introduction to Extensions 

The idea of FLP extensions is quite simple, and is itself a good example
of mixing the resources of logic and functional programming:

Suppose we would like to add a new capability of type C to a given function
f :: A -> B. Then all we need to do is to extend the type of the function to
f :: A -> B -> C, meaning that the old returned value is now an output
parameter, while the new value is introduced as the result of the function.

Consider the initial basic evaluator and suppose we want to enrich the
capabilities of the function

eval :: expr -> real

by associating a new value of type C with the currently returned real number.
Then we extend the function with the new feature, changing its type to

eval :: expr -> real -> C

Of course the definition of eval also needs to be modified, acknowledging that
the result of the evaluation is no longer the result of the function, but an
output parameter.

In order to hide the way the values of type C are composed we define a
combinator

(*) :: C -> C -> C

Hence the second rule for eval will have the shape

eval (A :/: B) R = eval A R1 * eval B R2 ...

with the values R, R1, R2 standing for the results of the evaluation of
A :/: B, A and B respectively. The problem of constructing the new result of
the function seems to be solved: eval A R1 and eval B R2 are actual values of
type C related to the 'old' values R1 and R2, and therefore can be combined by
using *. If later we replace C by C' we only need to change the definition of
* but not the basic structure of eval.

However, we still need to associate the value R1/R2 with the result of the 
evaluation R. This will be performed by function unit, which must ‘identify’ R 
and R1/R2. In order to generalize the definition to other situations, both values 
R and R1/R2 will be input parameters of unit. The logical way of adding unit 
to the definition of eval is simply by using *: 

eval (A :/:B) R = eval A R1 * eval B R2 * unit (R1/R2) R 

This means that unit should return a value of type C and, since we said above
that the result of the function was already properly constructed by
eval A R1 * eval B R2, the value of unit must be a true unit value with
respect to the operation *. Therefore, given a unit element e of type C, we
can define unit as

unit :: real -> real -> C
unit A A = e

where the repeated variable is just 'syntactic sugar' for

unit A B = e <= A==B

That is, unit returns e if the strict equality A==B succeeds. This produces the 
desired identification between the result R and R1/R2. 



4.2 Extensions of the Basic Evaluator

The 'extension counterparts' of the monadic variations presented in the
previous section may be seen in figure 3. The type C of our discussion is
represented respectively by the types trans and output, while the unit
elements are id and "", where the standard function id is defined as usual:

id X = X 

Further details about these examples may be found in section 5. 

4.3 Definition of Extension 

An FLP extension is a tuple (b, unit, *) where b is a specific type, unit is
a function of type A -> A -> b defined by unit A A = e with e ∈ b, and * is
a function of type b -> b -> b such that (e, *) is a monoid.

type state = int                    type output = string
type trans = state -> state

unit :: A -> A -> trans             unit :: A -> A -> output
unit A A = id                       unit A A = ""

infixr 30 *                         infixr 30 *
(*) :: trans -> trans -> trans      (*) :: output -> output -> output
(*) M K S = K S2 <= M S == S2       M * K = M ++ K

tick :: trans                       out :: output -> output
tick = (1+)                         out = id

eval :: expr -> real -> trans       eval :: expr -> real -> output
eval (val A) R = unit A R           eval (val A) R = unit A R *
                                                     out (line (val A) A)
eval (A :/: B) R =                  eval (A :/: B) R =
    eval A R1 * eval B R2 *             eval A R1 * eval B R2 *
    unit (R1/R2) R * tick               unit (R1 / R2) R *
                                        out (line (A :/: B) R)


Fig. 3. Extensions of the basic evaluator 



Now it can easily be proved that the variations of figure 3 are actually
extensions. For example, the pair ("", ++) used in the output extension is
known to satisfy the properties of monoids. The proof for the other case is
quite straightforward. Although this definition lacks the theoretical
background of the definition of monad, the structure of monoid is enough to
prove some simple assertions about the functions defined using * and unit,
along the same lines as [15].
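
The monoid requirement can be spot-checked mechanically. The Haskell sketch
below is an illustration under assumptions rather than a proof: the names
Trans, star, sameOn, transLaws and outputLaws are introduced here, star
stands for the combinator * of the state extension, and equality of state
transformers is only tested pointwise on a few sample states.

```haskell
-- State extension of figure 3, with Haskell names chosen for illustration.
type Trans = Int -> Int

-- star plays the role of *: run the left transformer, then the right one.
star :: Trans -> Trans -> Trans
star m k s = k (m s)

tick :: Trans
tick = (1 +)

-- Functions cannot be compared directly, so equality is checked pointwise
-- on the given sample states.
sameOn :: [Int] -> Trans -> Trans -> Bool
sameOn states f g = all (\s -> f s == g s) states

-- Monoid laws for (id, star): left unit, right unit, associativity.
transLaws :: Trans -> Trans -> Trans -> Bool
transLaws a b c = and
  [ sameOn samples (id `star` a) a
  , sameOn samples (a `star` id) a
  , sameOn samples ((a `star` b) `star` c) (a `star` (b `star` c))
  ]
  where samples = [-3 .. 3]

-- Monoid laws for ("", ++), the output extension.
outputLaws :: String -> String -> String -> Bool
outputLaws x y z = and
  [ "" ++ x == x
  , x ++ "" == x
  , (x ++ y) ++ z == x ++ (y ++ z)
  ]
```

For instance, transLaws tick (* 2) (subtract 5) and outputLaws "a" "bc" ""
both hold in this sketch.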

5 A Comparative Survey 

So far we have presented two ‘classical’ variations of the basic evaluator, using 
both extensions and monads. Now we can present a first comparative study of 
the two techniques. In the following points we show some of the advantages of 
using extensions that can be checked directly in the examples. 

• The definitions of types for extensions are simpler than in the case of
monads. Indeed, we do not need to worry about how to combine the old and the
new value, while monads need to define a suitable type constructor m. For
example, in order to add the output trace to the basic evaluator, we have
defined the type

type output = string

while the monadic version must also define

type m A = (output, A)

• As a consequence of the previous point, functions unit and * admit simpler 
definitions. For instance 

(*) :: output -> output -> output
M * K = M ++ K

indicates that the result of combining two outputs is the concatenation of both
of them. Observe, in particular, the symmetrical aspect of the type of (*). This
definition seems more readable than the monadic variation:

(*) :: m A -> (A -> m B) -> m B
(X,A) * K = let (Y,B) = K A in (X++Y, B)

• The symmetrical definition of * also entails some practical consequences, as it 
allows the programmer to change the order of the combined values. Thus we do 
not need to end the sequence with a unit expression, as in the case of monads. 
For instance, take the second rule for eval in the output monad: 

eval (A :/: B) = eval A * λR1. eval B *
                 λR2. out (line (A :/: B) (R1/R2)) *
                 λ(). unit (R1/R2)

It would be better to change the order of unit and out, writing instead

eval (A :/: B) = eval A * λR1. eval B * λR2. unit (R1/R2)
                 * λR. out (line (A :/: B) R)

avoiding the unnecessary repeated calculation of R1/R2 and separating the side 
effect from the main computation, but this is not possible without changing the 
definition of out. However the definition of * for extensions allows us to write 

eval (A :/:B) R = eval A R1 * eval B R2 * unit (R1/R2) R 
* out (line (A :/:B) R) 

where R1/R2 is computed only once. 

• The separation between the old and the new values also benefits the definitions 
of auxiliary functions such as tick or out. For example, as tick must increase 
the state we need only write 



tick = (1+) 



instead of the monadic definition 



tick X = ((), X+1)

These straightforward definitions also avoid the useless dummy variables and
values () that appear in the monadic definitions.

Of course, extensions have some disadvantages like any other programming 
technique. We can point out the following drawbacks: 

• Monads are a more abstract technique. They are based upon deep theoretical 
results and can be applied to a number of different areas beyond programming, 
such as type inference or semantics, while extensions are hitherto just a specific 
methodology of FLP. 

• Some monads cannot be thought of in terms of extensions, because they are 
not meant to add new values to a previously given function. For instance, lists 
may be seen as a monad (see [15]), while they cannot be defined in terms of 
extensions. 

Therefore, extensions cannot be applied in all the situations where monads
can. Conversely, can monads replace extensions? In Section 6 we will present
some applications of extensions that cannot be accomplished by monads, hence
showing that neither technique subsumes the other.

6 Other Features of Extensions 

Extensions and monads look quite similar, but actually they can be used to 
solve different problems. We have pointed out in Section 5 some limitations 
of extensions. Now we are going to show how extensions can be used in two 
situations where monads cannot be readily applied. 



6.1 Avoiding Unnecessary Computations 

Monads (as well as extensions) allow one to increase the capabilities of
functions while keeping their basic structures unaffected. Of course, these
extra features also entail extra computation time. The efficiency of the two
techniques is quite similar (both in time and space) when the extra features
are computed. However, the situation changes remarkably at the points of the
program where only the old value of the function is required. This may be
especially pronounced when dealing with the state monad (or extension).

Imagine for example that we need a variation of the evaluator of expressions
that not only computes the resulting real number but also maintains an
ordered list of the numbers that appear in the expression. Such a variation
may be seen in figure 4, using monads and extensions, with the function
insert defined as usual. Functions *, unit and types m A and trans have not
been included because they are those of the state variations we showed before
(figures 2 and 3). Here function tick is used to insert an element into the
ordered list, while the initial state is the empty list.

type state = [real]                     type state = [real]

tick :: real -> m ()                    tick :: real -> trans
tick A S = ((), insert A S)             tick A = insert A

eval :: expr -> m real                  eval :: expr -> real -> trans
eval (val A) = tick A * λ(). unit A     eval (val A) R = tick A * unit A R
eval (A :/: B) = eval A *               eval (A :/: B) R = eval A R1 *
                 λR1. eval B *                             eval B R2 *
                 λR2. unit (R1/R2)                         unit (R1/R2) R

Fig. 4. Evaluator yielding an ordered list, using monads (left) and
extensions (right)

For example, using extensions we may try

eval (val 8 :/: val 4 :/: val 2) R [] == L

which returns the values

R == 4
L == [2,4,8]

However, we might still need to evaluate expressions just to get the result,
discarding the list. In this case, the insertion of all the elements in the
list is an unnecessary overhead that should be avoided. Using extensions this
can be done by simply not providing the initial state [] to the goal. Then
the result of evaluating the expression is computed as usual, but the state
is returned as a 'chain of actions' not yet evaluated, as witnessed by the
goal

eval (val 8 :/: val 4 :/: val 2) R == L

that returns

R == 4
L == (insert 8 * id) * ((insert 4 * id) * (insert 2 * id) * id) * id

Thus the actual insertion in the list is not carried out, and we can define a
function eval' as

eval' Expr = R <= eval Expr R == _

Note that this cannot be done by using monads, because the two values, the
numeric result and the list, are actually parts of a single value. Indeed, if
we do not provide the initial state to the monadic variation, a goal like

eval (val 8 :/: val 4) == L

yields an expression of the shape

L == (tick 8 * λ(). unit 8) * λR1. (tick 4 * λ(). unit 4) *
     λR2. unit (R1/R2)

because functions tick, unit and * cannot be reduced until an initial state
is provided. Thus we can either compute both the result and the ordered list,
or neither.

The use of the function eval' whenever the list is not required can speed up
the program considerably. Testing with an expression of 300 numbers, we
found that the time differences between eval' and eval using extensions can
range from 0.38 s to 5.10 s. And, despite the long chain of insert and id
functions that eval' must construct, the space required is also less than in
the case of actually performing the insertions with eval.
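
The separation between the numeric result and the pending state actions can
be imitated in Haskell, although only as a rough sketch. Haskell has no
logic variables, so we assume here that the output parameter R is modelled as
the first component of a pair; the names evalExt, evalOnly and evalWithList
are hypothetical and do not appear in the TOY code.

```haskell
import Data.List (insert)

data Expr = Val Double | Expr :/: Expr
infixr 7 :/:

type State = [Double]
type Trans = State -> State

-- The output parameter R of the extension is modelled as the first
-- component of the result pair; the second component is the pending
-- chain of insertions.
evalExt :: Expr -> (Double, Trans)
evalExt (Val a)   = (a, insert a)
evalExt (a :/: b) =
  let (r1, t1) = evalExt a
      (r2, t2) = evalExt b
  in (r1 / r2, t2 . t1)   -- t1 runs first, then t2

-- Counterpart of eval': keep the number, never run the insertions.
evalOnly :: Expr -> Double
evalOnly = fst . evalExt

-- Supplying the initial state [] runs the chain and builds the list.
evalWithList :: Expr -> (Double, State)
evalWithList e = let (r, t) = evalExt e in (r, t [])
```

With e = Val 8 :/: Val 4 :/: Val 2, evalOnly e gives 4.0 without touching
the list, while evalWithList e gives (4.0, [2.0,4.0,8.0]).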



6.2 A Parser for Free 

Consider the boolean expressions defined as 

infixr 20 :/\:
infixr 15 :\/:

data expr = val bool | expr :/\: expr | expr :\/: expr

Suppose that we decide to define an evaluator evalb for these expressions,
returning not only the result of the evaluation, but also a suitable
representation of the expression. The code for such a function may be seen
in figure 5, using monads (left side) and using extensions (right side), and
is a simple application of the output feature presented before.

Functions or and and are defined as usual in functional programming, while
function conv may be easily defined as

conv true = "T"
conv false = "F"

For example, using the monadic variation, we may try

evalb (val true :/\: (val false :\/: val true)) == R

which returns

R == ("(T and (F or T))", true)

evalb :: expr -> m bool             evalb :: expr -> bool -> output
evalb (val A) = out (conv A) *      evalb (val A) R = out (conv A) *
                λ(). unit A                           unit A R
evalb (A :\/: B) =                  evalb (A :\/: B) R =
    out "(" * λ(). evalb A *            out "(" * evalb A R1 *
    λR1. out " or " * λ().              out " or " *
    evalb B * λR2. out ")" *            evalb B R2 *
    λ(). unit (R1 `or` R2)              unit (R1 `or` R2) R * out ")"
evalb (A :/\: B) =                  evalb (A :/\: B) R =
    out "(" * λ(). evalb A *            out "(" * evalb A R1 *
    λR1. out " and " * λ().             out " and " *
    evalb B * λR2. out ")" *            evalb B R2 *
    λ(). unit (R1 `and` R2)             unit (R1 `and` R2) R * out ")"

Fig. 5. Boolean evaluator with output, using monads and extensions

Suppose now that, after evaluating a few expressions using the new variation,
we decide that representations like "(T and (F or T))" are definitely nicer
and more readable than

evalb (val true :/\: (val false :\/: val true))

and that we would like to define a version of evalb accepting strings
representing expressions as input parameter. Does this mean that we now need
to define a parser for boolean expressions? The answer is no, if we use
extensions. Indeed, the extension of the boolean evaluator shown in figure 5
can be used as a parser without making any changes, as witnessed by the goal

evalb Expr R == "(F and (F or T))"
which succeeds with 

Expr == val false : /\ : (val false : \/ : val true) 

R == false 



This nice outcome of extensions is an example of the generate & test
technique, very common in logic programming. Ours is therefore actually a
recursive top-down parser for the grammar rules expressed in evalb by means
of output (for terminals) and recursive calls of evalb (for non-terminals).
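
The generate & test reading can be imitated in a purely functional setting,
at the price of an explicit search bound. The Haskell sketch below is an
emulation under assumptions and not the narrowing mechanism of TOY: the
functions exprs and parseByTest and the depth parameter are introduced here
for illustration, and the fixities 3 and 2 replace 20 and 15 because Haskell
only allows precedences from 0 to 9.

```haskell
data Expr = Val Bool | Expr :/\: Expr | Expr :\/: Expr
  deriving (Eq, Show)
infixr 3 :/\:
infixr 2 :\/:

-- Evaluator that also renders the expression, in the spirit of figure 5.
evalb :: Expr -> (String, Bool)
evalb (Val a)    = (if a then "T" else "F", a)
evalb (a :\/: b) = combine " or " (||) (evalb a) (evalb b)
evalb (a :/\: b) = combine " and " (&&) (evalb a) (evalb b)

combine :: String -> (Bool -> Bool -> Bool)
        -> (String, Bool) -> (String, Bool) -> (String, Bool)
combine name op (s1, r1) (s2, r2) =
  ("(" ++ s1 ++ name ++ s2 ++ ")", r1 `op` r2)

-- Generate: all expressions of nesting depth at most n.
exprs :: Int -> [Expr]
exprs n
  | n <= 0    = leaves
  | otherwise = leaves ++
      [ a `op` b | op <- [(:/\:), (:\/:)], a <- smaller, b <- smaller ]
  where
    leaves  = [Val False, Val True]
    smaller = exprs (n - 1)

-- Test: keep the candidates whose rendering equals the input string.
parseByTest :: Int -> String -> [(Expr, Bool)]
parseByTest depth s =
  [ (e, r) | e <- exprs depth, let (t, r) = evalb e, t == s ]
```

Here parseByTest 2 "(F and (F or T))" finds the single expression
Val False :/\: (Val False :\/: Val True) together with its value False. The
depth bound is a design choice of the sketch; the TOY goal needs no such
bound because the comparison with the string prunes the search as the
expression is being generated.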



But why is it not possible to use the monadic variation in this case? It is
due to the combination of the string representation and the output value,
which is a free variable. For example, the goal

evalb Expr == ("(F and T)", R)

loops. We must recall that strict equality does a 'careful matching', as we
showed before. In the example, this means generating the outer constructor of
both "(F and T)" and R by means of evalb Expr. But getting an outer
constructor for R entails generating a whole expression, and by using the
second rule of evalb, infinitely many expressions may be generated. These
expressions, all of which have an or in their representations, fail when
finally compared with "(F and T)".

7 Conclusions 

We have shown throughout this paper that extensions are a suitable mechanism 
to solve a number of problems when working in a functional-logic language. 
Although lacking the deep theoretical background of monads, extensions can 
be used as an alternative to define easily reusable code. The concepts used are 
simple, and were already known in each declarative paradigm, such as the use 
of arguments in logic programming to return output values, or the definition of 
higher order combinators (e.g. *) in order to connect different computations in 
sequence. The novelty of our approach is that it combines techniques of both
main declarative streams, yielding a new mechanism that allows us to address
problems, such as the addition of new features to functions, in a simple and
appealing way. Specifically, extensions avoid the necessity of lambda
abstractions, provide a more symmetric definition of the combinator * (from
the point of view of types) and lead to nicer and more natural definitions of
types and auxiliary functions.

In spite of all the resemblances, extensions and monads are different
techniques, each one with its own particularities and limitations. An
advantage of extensions is that they provide functions with the possibility
of multiple modes of use, therefore defining functions that can be reused in
a wider sense than in the case of monads. Another advantage is that the state
extension allows one to dismiss the stateful computations whenever they are
not of interest, hence saving both time and space.

Abstract. The present work describes elementary educational informatics
based on a programming support environment. The structure of the environment
and its components, developed for national languages, are investigated. The
problems of language tools and program development systems, as well as of
computer support and informatics systems, are studied.



1 Introduction 

Educational informatics provides elementary computer knowledge [1]. This
course should be supplied with a special teaching conception (model).
Educational informatics can be considered an elementary subject (such as
mathematics, physics, biology), but it has two peculiarities which should be
taken into consideration:

1. Its methodological and technological basis and its teaching methods are
changing rapidly;

2. It needs constant support with special technical, language and programming
means.

The chosen teaching model and the implementation of its programming language
support should therefore take these peculiarities into account.

We present one model of educational informatics and introduce ways of
developing language tools and programming systems aimed at supporting this
model. The paper consists of three parts and a conclusion. The first part
describes the informatics teaching model. The peculiarities of the
development of the components of the complex are introduced in the second and
third parts.

2 Informatics Teaching Model 

At present different methods are used in the conceptual development of
informatics teaching. One of them is the vanguard style, directed at the
study of the logical and mathematical basis of algorithmization and the
elements of programming. Taking this style as the basis, we can offer an
informatics teaching model. Its general strategy consists of the following
five principles:

1. Subject-tool usage of informatics. Informatics is studied via computer
means, and informatics technology is used to support the universal ways of
educational activities [1];

2. Learners' qualification. The main emphasis is on teaching algorithmics
and programming;

3. Knowledge of the basic structures of algorithms and typical programming
methods. The constant interconnection between algorithmization and
programming is demonstrated, and a "smooth" transition from one level to
another is shown;

4. Usage of the native language. Teaching is oriented to the learner's native
language, with the use of language and program means and technological
methods based on a national interface;

5. Multinational tools. Language, program and technological means of study
are developed with their adaptation to various lexicons in mind.

The functional purpose of such a method, language tools and programming
systems is to satisfy the needs of the informatics teaching model for senior
pupils and junior students.

3 Language Tools 

Language tools comprise a special language used for the description of
algorithms and a high-level programming language.

Algorithmic Language. A special algorithmic language (SAL) [2] has been
chosen as the algorithmic language. This language has different notations
close to natural language, algorithms in it can be written and read as usual
text and, what is more, the study of this language will help one to gain a
more profound knowledge of any programming language in the future. Another
important aspect of SAL is that its structure is close to the algorithmic
mentality of a learner: it has no goto command, which satisfies the
requirements of structured construction, and there are no details connected
with the computer hardware.

For ease of use and implementation, some changes and additions were made to
SAL, such as the introduction of two new commands (input and output) for
intermediate data input and output, the specification of dynamic tables, the
use of keywords without underlining, linear notation of expressions, etc., as
in modern programming languages (PL).

Programming Language. We have chosen Pascal/R [3], which is in essence
Standard Pascal [4] in a Russian notation. In general, we can state two main
factors affecting the definition of the language tools.

The first is the choice of the minimum necessary structures sufficient for
the initial study of algorithms and programming and traditional for many
modern languages. Some data types, as well as statements and special Pascal
functions, were excluded. The label, the set data type, the variant record,
the goto statement, and loop statements of the form for...downto and
repeat...until were dropped, and the procedures get(f) and put(f) for
processing a file f were not used. For practical needs, the data types string
and string[n] were included in the language, as well as two new file
processing procedures (Close and Assign) from the programming language Turbo
Pascal.

The second factor is connected with the national lexicons of the language
means. That is why they were localized to other languages (Russian, Uzbek,
Tajik) on the basis of Cyrillic and Latin scripts. Thus, algorithmics and
programming should be studied on the basis of the native language of the
learner, and the input languages of the system should have a modifiable
lexical structure.

4 Programming Systems 

The programming complex consists of a specialized system based on SAL and a
programming system (PS) with the localized input language Pascal. The first
system is used for computer support of the algorithmic course, and the PS for
the study of the fundamentals of programming. Each of these systems is
developed in the form of an integrated environment with common components.
The environment is primarily designed to support a definite style of
constructing and debugging algorithms (programs). Its components are the
working window with the main menu, the editor, the compiler, the help
subsystem and the database (DB). Let us briefly discuss the peculiarities of
the teaching environment and its components.

The Program Construction and Debugging Style. The environment is oriented to
the style of structured construction on the basis of structure editing of
algorithms. It is also supplied with a display of program debugging and
running [5]. Structured construction in educational informatics gives the
following advantages:

a) This type of construction is closer to the operational mentality of a
person. It makes possible a more adequate description of typical processes in
the application area of the task with the help of definite integral
constructions of the programming language;

b) It combines the "strict" requirements of structural editing with the usual
kind of string editing. The former puts limitations on the user's actions and
is useful in training beginners;

c) The program becomes an active object from the very beginning and
throughout the process of its construction. It is partially ready for use
even before being completed. This allows the program to be checked step by
step, which assures the programmer of the correctness of his choices and
actions.

Another important mechanism of the education process is the visualization of
the debugging and running of a program. To display the process of running,
output facilities should be implemented in the system so that the text of the
original program is presented on the display together with its enlarged,
detailed block scheme, with underlined and coloured areas for the scopes of
constructions and the keywords of the language, and with the values of
variables indicated at the control points.

Working Panel and Main Menu Console. It is known that the interactive mode
is the basic one in training systems. It provides interaction with the pupil.
At program start the main menu is displayed at the top of the window. The
menu contains the options File, Editing, Translation and Lexics, each of
which can have vertical suboptions. In the lower part of the display there is
a prompt line showing the correspondence between the function keys and
actions. Using the option Lexics [suboption], a learner chooses the notation
for writing algorithms in SAL (or Pascal programs) and the lexicon of the
communication system. Further work is carried out in this language
environment.

Editor. The embedded system editor has two operation modes, textual and
structural. The structural mode is the main one, since in the process of
studying a PL much attention is given to its syntactic constructions and
rules. In the structural mode the editor has both traditional and specialized
operations on texts, such as calling language constructions which are kept in
the form of "ready-made" pictures, transformation of the source program text
into an abstract syntax tree (AST) and vice versa, reorganization of the AST,
recognition of elementary errors and output of information on them, and
creation and modification of pictures. The pictures (the general structure of
algorithms in SAL and of Pascal programs, program items, commands,
statements, additional algorithms and subprograms) are initially kept in the
database; they are called when needed and, after processing, "loaded"
("hung") at a definite place in the abstract tree.

For example, the construction

if (condition) then (statement1) else (statement2);

is produced on the display in its entirety, while the parts in brackets are
easily deleted and their place is taken by real constructions.

Thus, this principle of structural editing makes it possible to check the
program at each step of its development, which is useful for the learner and
prevents him from making errors.
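
The template mechanism described above can be illustrated by a small sketch.
The following Haskell fragment is only an assumption about how such
"pictures" with placeholders might be modelled; the type Stmt and the
functions fill and complete are hypothetical names introduced here and do not
come from the described system.

```haskell
-- A toy abstract syntax tree with named placeholders ("holes").
data Stmt
  = Hole String            -- unfilled placeholder, e.g. "statement1"
  | Assign String Int      -- variable := constant
  | If String Stmt Stmt    -- condition kept as plain text for brevity
  deriving (Eq, Show)

-- The ready-made picture for the conditional construction.
ifTemplate :: Stmt
ifTemplate = If "condition" (Hole "statement1") (Hole "statement2")

-- Replace every hole with the given name by a real construction.
fill :: String -> Stmt -> Stmt -> Stmt
fill name new (Hole h) | h == name = new
fill name new (If c t e) = If c (fill name new t) (fill name new e)
fill _ _ s = s

-- A program is complete when no placeholder is left in its tree.
complete :: Stmt -> Bool
complete (Hole _)   = False
complete (If _ t e) = complete t && complete e
complete _          = True
```

Filling statement1 and then statement2 of ifTemplate turns an incomplete
tree into a complete one, and complete can be consulted after every editing
step, which mirrors the step-by-step checking mentioned in the text.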

Compiler. The specialized translators of the system based on SAL and of the
PS with the Pascal input language are developed in the form of interpreters.
This method is used to simplify the implementation and has the following
peculiarities. The interpreter opens the prospect of easy management of the
design process, and of debugging and visualization of programs [6]. Using the
interactive mode, we can write, check and run programs within the
interpreter. Errors can be easily corrected: there is no need to go back to
the editor and compile the program again. Structural construction and editing
[7] provides the intermediate representation and interpretation of incomplete
programs. The programming systems are intended for educational purposes and
are not aimed at solving tasks which require immediate actions.

Help Subsystem and Database System. The DB is developed on the basis of the
electronic textbook ideology. It is intended to keep structured information:
theoretical, practical and methodological materials (glossaries containing
the basic notions and terms of the input language, a set of typical schemes,
demonstration algorithms and programs, as well as many tests aimed at
checking and evaluating the trainee's knowledge).

The database of the environment is organized and kept in the form of a
hypertext. The database files contain the texts of programs in their main and
intermediate representations, educational texts and additional information
connected with a definite lexicon. The construction and functioning of the
components of the help subsystem and the database are supported by a separate
tool system. It contains a set of special operations for working with the
hypertext and for information input, processing and output in the DB system.

5 Conclusion 

The educational model for studying informatics is discussed in this work, as
well as questions of the development of the language tools and programming
systems intended to support this model. The general strategy of teaching is
based on advanced learning of algorithmics and elements of programming on the
basis of a native language. The structure of the complex and the requirements
on its components, including the language and program facilities and hardware
support, have been defined.

References 

1. Ershov A. P. Computerization of Schools and Mathematical Education // Proc. of
the Sixth Intern. Congress on Mathematical Education. Budapest, 1988. P. 49-65.

2. Ershov A. P. Basic Concepts of Algorithms and Programming to Be Taught in a
School Course in Informatics // Proc. of the Intern. Joint Conf. on Theory and
Practice of Software Development (Tapsoft). Berlin, 1985. 14 p.

3. Kobilov S. S. Programming system Pascal/R. In: Important Problems of
Application Mathematics and Economics, Samarkand: SSU, 1997, pp. 73-79. (in Russian).

4. Jensen K., Wirth N. Pascal. User Guide and Report. Springer-Verlag, 1978.

5. Boltaev T. B., Kuzminov T. V., Pottosin I. V. On Structured Construction and
Supporting Tools. In: Programming Environments: Methods and Tools,
Novosibirsk, 1992, pp. 22-37. (in Russian).

6. Gries D. Compiler Construction for Digital Computers. John Wiley & Sons, Inc.,
N.Y., 1971.

7. Boltaev T. B. Interpreter of Incompleted Programs. In: Programming
Environments: Methods and Tools, Novosibirsk, 1992, pp. 38-50. (in Russian).




Current Directions in Hyper-Programming 



Ronald Morrison¹, Richard C.H. Connor², Quintin I. Cutts², Alan Dearle³,
Alex Farkas⁴, Graham N.C. Kirby¹, Robert McGettrick³, and
Evangelos Zirintsis¹

¹ School of Mathematical and Computational Sciences,
University of St Andrews, North Haugh, St Andrews, Fife, KY16 9SS, Scotland
{ron, graham, vangelis}@dcs.st-and.ac.uk
² Department of Computer Science, University of Glasgow,
Glasgow G12 8QQ, Scotland
{richard, quintin}@dcs.gla.ac.uk
³ Department of Computing Science and Mathematics,
University of Stirling, Stirling, FK9 4LA, Scotland
{al, rmc}@cs.stir.ac.uk
⁴ Vision Systems Ltd,
Adelaide, S.A., Australia
Alex.Farkas@vsl.com.au



Abstract. The traditional representation of a program is as a linear 
sequence of text. At some stage in the execution sequence the source text is 
checked for type correctness and its translated form is linked to values in 
the environment. When this is performed early in the execution process, 
confidence in the correctness of the program is raised. During program 
execution, tools such as debuggers are used to inspect the running state 
of programs. Relating this state to the linear text is often problematical. 
We have developed a technique, hyper-programming, that allows the 
representations of source programs to include direct links (hyper-links) 
to values, including code, that already exist in the environment. Hyper- 
programming achieves our two objectives of being able to link earlier 
than before, at program composition time, and to represent sharing and 
thus closure and through this the run-time state of a program. This pa- 
per reviews our work on hyper-programming and proposes some current 
research areas. 



1 Introduction 

Fig. 1, taken from [1], shows an example of a Napier88 hyper-program. The 
program source, which is itself a persistent object, comprises text and hyper- 
links to other objects in the persistent store. 

The first hyper-link is to a persistent first-class procedure value writeString 
which writes a prompt to the user. The program then calls another procedure 
readString to read in a name, and then finds an address corresponding to that 
name. This is done by calling a procedure lookup to find the address in a table 
data structure linked into the hyper-program. The address is then written out. 
Note that the code objects (readString, writeString and lookup) are denoted 
using exactly the same mechanism as data objects (the table)^1 and all of these 
are external to the hyper-program but within the persistent environment. 
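
The structure just described can be approximated outside Napier88. The following Python sketch is illustrative only: it models a hyper-program as a sequence of text fragments and direct references to live objects. The names HyperLink and HyperProgram, the in-memory table and its contents are assumptions made for this example and do not reflect the Napier88 representation.

```python
from dataclasses import dataclass, field
from typing import Any, List, Union


@dataclass
class HyperLink:
    """A direct reference to a value that already exists in the environment."""
    target: Any          # the live object itself, not a name or a path
    label: str = ""      # display label only; not part of the semantics


@dataclass
class HyperProgram:
    """Program source made of text fragments and hyper-links."""
    parts: List[Union[str, HyperLink]] = field(default_factory=list)

    def links(self) -> List[HyperLink]:
        return [p for p in self.parts if isinstance(p, HyperLink)]

    def display(self) -> str:
        # Render links as bracketed labels, much as an editor might show buttons.
        return "".join(p if isinstance(p, str) else f"[{p.label}]" for p in self.parts)


# Hypothetical values assumed to exist already in the environment.
table = {"ron": "St Andrews"}


def lookup(name, tbl):
    return tbl[name]


program = HyperProgram([
    HyperLink(print, "writeString"), '("Enter name: ")\n',
    "let name = ", HyperLink(input, "readString"), "()\n",
    HyperLink(print, "writeString"), "(",
    HyperLink(lookup, "lookup"), "(name, ", HyperLink(table, "table"), "))",
])

print(program.display())
```

In this sketch code (lookup) and data (table) are linked by the same mechanism, and the labels exist only for display.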




Fig. 1. A Napier88 Hyper-Program 



A requirement for hyper-programming is the presence of an external value 
space to which bindings can be constructed during program composition. The 
external source may be provided by a persistent store, a file system or any 
other mechanism such as the WWW. No matter which external source is used, 
a fundamental change in the nature of the source program has taken place since 
it now contains both text and hyper-links to values in the environment. This 
non-flat representation of the program source challenges our traditional notions 
of what constitutes a computer program. The reason for the name hyper-program 
is the analogy with hyper-text which is also non-flat and contains both text and 
hyper-links to other hyper-text. 

The major issue in building hyper-programming systems concerns the semantics 
of the hyper-links, such as: 

• what can a hyper-link refer to? 

• what guarantees can be made about a hyper-link's referent data? 

• how are hyper-links typed and when does type-checking occur? 

The degrees of freedom regarding what a hyper-link can refer to depend 
upon the programming language semantics and the measure of openness of the 
system. Normally hyper-links will be able to refer to all language first class 
values. Second class entities that are not in the value space, such as types, may 
also be conveniently hyper-linked depending on the flavour of the language. Update 
may be accommodated through hyper-links by linking to locations, which may or 
may not be first class values. More interesting is the extent to which hyper-links 
may refer to values created independently of the system, such as Web pages and 
DCOM objects. Furthermore the openness of the system can be extended by 
making the hyper-program representation open for other tools to manipulate. 

^1 Note also that the names used in this description of the hyper-links have been 
associated with the objects for clarity only, and are not part of the semantics of the 
hyper-program. 

Referential integrity in a hyper-programming system means that once a 
hyper-link is established its referent is guaranteed by the system to exist and to 
be the same value when the hyper-link is executed. While this guarantee may be 
provided by a strongly typed persistent object store, it may also be expensive to 
provide in a distributed system. Variations therefore include the hyper-link being 
valid but not necessarily referring to the original value, and the hyper-link 
referring to a copy of the original. This may only be a problem where object 
identity is important such as in sharing semantics. A hyper-program may therefore 
display a range of failure modes from not failing to failure from the hyper-link 
being no longer valid. 
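
The range of guarantees can be made concrete with a small sketch. The three link classes below are hypothetical, written in Python purely for illustration: a strong link that keeps its referent alive, a weak link that may become invalid, and a copy link that always resolves but loses object identity.

```python
import copy
import weakref


class Record:
    """A stand-in for a stored object."""
    def __init__(self, value):
        self.value = value


class StrongLink:
    """Holds the referent itself, so it can neither disappear nor be swapped."""
    def __init__(self, target):
        self._target = target

    def resolve(self):
        return self._target


class WeakLink:
    """May dangle: resolution fails once the referent has gone."""
    def __init__(self, target):
        self._ref = weakref.ref(target)

    def resolve(self):
        target = self._ref()
        if target is None:
            raise LookupError("hyper-link is no longer valid")
        return target


class CopyLink:
    """Always resolves, but to a copy, so object identity is lost."""
    def __init__(self, target):
        self._copy = copy.deepcopy(target)

    def resolve(self):
        return self._copy


original = Record(1)
strong, weak, copied = StrongLink(original), WeakLink(original), CopyLink(original)
original.value = 2   # an update made after the links were established
```

After the update, the strong and weak links observe the new value because they share the original object, whereas the copy link still holds the old one. This is the sharing problem noted above.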

The final issue is how hyper-links are typed, if at all. We will assume 
for the present that they are. The interesting aspect of type checking is that 
the contract between the program and the referenced value may now take on a 
different agreement procedure. Instead of the program asserting the type of the 
hyper-link and the type checking system ensuring that the hyper-link has the 
correct type when it is used, the reverse may be used. That is, the hyper-link 
knows its own type and therefore when it is used the program can be made to 
conform to this type. Statically this removes the need for type specifications for 
hyper-links in hyper-programs and dynamically it means that the program may 
be in error rather than the hyper-link. 

This paper reviews our work on hyper-programming, discusses the advan- 
tages of the technique and proposes some current research areas. These include 
presenting a single representation of data and code throughout the software pro- 
cess; adapting hyper-programming to persistent contexts that do not enforce 
referential integrity, such as the WWW; and implementing and using hyper- 
programming in standardised languages and inter-operability mechanisms. 

2 Motivations & Previous Work 

Our work on hyper-programming is motivated by a belief that programming lan- 
guage systems could provide better support for the software engineering process 
than they do at present. In particular, consider the traditional compose-compile- 
link-execute cycle of program development as illustrated in Fig. 2. 

In precis, a program is composed using a text-editor; compiled using a compiler, 
which may also link in other source text; linked with other pre-compiled 
code; and finally executed where it may link to persistent data such as files. 
During execution, other tools such as symbolic debuggers and run-time browsers may 
be used to inspect the running state of the program. Thus there are four main 
processes: composition, compilation, linking and execution each with their 
appropriate tools such as text-editors, compilers, linkers, debuggers and browsers. 
Each tool operates on a particular translated version of the program such as 
source text, object code or executable code. 

Fig. 2. The Traditional Compose-Compile-Link-Execute Cycle 

There are two obvious questions that may be asked about the compose- 
compile-link-execute cycle. They are: 

• why are there so many processes and translated forms of the program? and 

• what level of detail should the user see? 

For the systems programmer the processes and translated forms provide the 
necessary level of control over the cycle. The translated forms allow common 
tools, such as optimisers, to be used even where the original forms are from 
disparate sources. The processes are necessary for manipulating the translated 
forms. 

From the application programmer's point of view, the processes and translated 
forms often constitute noise in the execution cycle and a distraction from 
the task of constructing the system. Modern programming environments, such 
as CodeWarrior [2], attempt to hide this level of detail from the applications 
programmer. Hyper-programming is a further step in this direction and the paper 
explores how effective the concept can be in different environments. 



2.1 Constructing Hyper-Programs 

The primary motivation for hyper-programming is to allow the user to compose 
programs interactively [3, 4], navigating the environment and selecting data 
items, including code, to be incorporated into the programs. This removes the 
need to write access specifications for extant data items that are used by a 
program. For example, in a file system the access specification may be a path 
name, and in a persistent object store it may be a path to an object from a root 
of persistence. 

Our first attempts at constructing a hyper-programming system were conducted 
in the Napier88 persistent programming environment. The strongly typed 
persistent object store guaranteed referential integrity of the hyper-links. Exist- 
ing languages that allow a program to link to persistent data items, including 
files, at any time during its execution require it to contain code to specify the 
access path and type for each data item. The access path defines how the data is 
found by following a particular route through the persistent store starting from 
a root of persistence. The type specifies the expected type of the data at that 
position. When a program is compiled the compiler checks that subsequent use 
of the data is compatible with this expected type. When the program is executed 
the run-time system checks that the data is present at the declared position and 
that it does have the expected type. 

This mechanism gives flexibility because a program can link to data in the 
store at any time during its execution. However in many cases the programmer 
knows that a particular data item is present in the store at the time the program 
is written and the programming system could obtain all the information in the 
access specification by inspecting the data item at that time. 

In a hyper-programming system the programmer has the option of linking 
existing data items into a program by pointing to graphical representations 
rather than writing access specifications. There are two advantages to this early 
composition-time linking. Firstly, errors that may occur in programs due to the 
access specification being invalid at the time of execution are completely avoided. 
Such errors may occur where the store topology has changed and the access path no 
longer exists, even if the object does; where the object has been deleted; or where 
the object has been replaced by one of a different type. In all cases the contract 
between the program and the persistent store has been broken and the program 
may not execute safely. 

In the hyper-programming system the hyper-link is direct to the object and 
is guaranteed to be valid, at the time of the program execution, by the persistent 
store's referential integrity. Thus if the topology of the store changes, the link 
will still be valid; the object may not be deleted since the hyper-program still 
has access to it; and it may not change its type. 
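
The contrast between an access specification and a direct link can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Store class, its follow method and the sample data are invented for illustration and stand in for a persistent store and its run-time checks.

```python
class Store:
    """A toy persistent store: nested dictionaries reachable from a root."""
    def __init__(self):
        self.root = {}

    def follow(self, path, expected_type):
        """Resolve an access specification (path plus expected type) at run-time."""
        node = self.root
        for step in path:
            if not isinstance(node, dict) or step not in node:
                raise LookupError(f"access path broken at {step!r}")
            node = node[step]
        if not isinstance(node, expected_type):
            raise TypeError(
                f"expected {expected_type.__name__}, found {type(node).__name__}")
        return node


store = Store()
addresses = ["North Haugh"]
store.root = {"people": {"addresses": addresses}}

# Conventional program: records how to find the data item.
access_spec = (["people", "addresses"], list)
# Hyper-program: records the data item itself, bound at composition time.
direct_link = store.follow(*access_spec)

# The store topology changes after the program was written.
store.root = {"archive": store.root.pop("people")}

try:
    store.follow(*access_spec)
    path_still_valid = True
except LookupError:
    path_still_valid = False

print(path_still_valid)           # the path-based lookup now fails
print(direct_link is addresses)   # the direct link is unaffected
```

Holding the reference also keeps the object alive in Python, which loosely mirrors the guarantee that a linked object cannot be deleted. Python does not, however, prevent the object's contents from changing type.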

Fig. 3 shows an example of the user interface that might be presented to the 
user by a hyper-program editing/browsing tool. The editor window (top-left) 
contains embedded buttons representing hyper-program links; when a button is 
pressed the corresponding object is displayed in a browser window (lower region). 

The hyper-links to persistent values are placed in the hyper-program by se- 
lecting each value with the store browsing tool and then pressing the Link button. 
In Napier88, the system asks the programmer whether to link the program to the 
value itself or to the store location that currently contains the value. The editor 
then inserts the link at the current text position, represented by a light-button. 

2.2 Safety and Efficiency 

Hyper-programming can provide improved safety in several ways. One of these 
is that it allows some program checks to be performed earlier than normal, 
subsequently giving increased assurance of program correctness. This is possible 
because data items accessed by a program may be available for checking before 
run-time. Referential integrity then ensures that the checked data remains 
available at run-time. 

Fig. 3. User Interface to a Hyper-Program Editor 

Checking can be performed at several stages in the program development 
process in existing systems. The principal opportunities are at compilation-time 
when a program is translated into an executable program, and at run-time when 
the executable program is executed. Categories of checking include checking 
programs for syntactic correctness and type consistency, and checking persistent 
data access. 

Checking Persistent Data Access. In conventional strongly typed persistent 
systems a program contains an access specification for each persistent data item 
used. These access specifications are checked at run-time: at that time the system 
verifies that each data item is present in the store, with the previously declared 
access path and type. 

A program execution will fail if the store does not contain a route to a data 
item corresponding to the access path specified in the program. Thus even if it 
is known at the time of writing that a particular program will execute correctly, 
it cannot be predicted when it may fail on some future execution. 

The use of hyper-programs as source representations allows the checking of 
access specifications to be performed before run-time. Each link in a 
hyper-program denotes a data item that exists in the store at the time the 
hyper-program is composed. The process of checking the access path is moved from 
run-time to program composition time. The access path is established incrementally 
as the programmer manipulates the graphical representations of the data in the 
store to locate the required data item. Once the path has been established the 
data item at the end of it is linked into the hyper-program and the path need 
not be followed again at execution time. The hyper-program will be unaffected 
if the access path is then removed. 

The access path part of the access specification is established during hyper- 
program composition. The other part, the type specification of the data item, is 
checked when the type consistency of the hyper-program is verified at or before 
compilation-time. The system checks that the type of the data item denoted by 
the link is compatible with the use of the link in the program. 

Creating direct links from a hyper-program to values in the store, with the 
attendant safety benefits described above, is only applicable where values are 
present in the store at hyper-program composition time. Added flexibility can be 
gained by using links to denote mutable locations in the store. Linking a location 
into a hyper-program involves the same processes as for linking a value, with the 
difference that the value associated with the link changes when the location is 
updated. Updates to the location may occur at any time after the composition 
of the hyper-program. Strong typing ensures that the type of any value assigned 
to a location is compatible with the type of its original contents. This allows the 
type checking of persistent locations to be performed at compilation-time. The 
values in locations associated with the links in a hyper-program can vary but 
their types will always remain compatible. Where a link denotes a location, that 
location is linked directly into the executable program produced from the 
hyper-program, so that updates to the location also affect the executable program. 
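
The difference between linking a value and linking a location can be illustrated as follows. The Location class is a hypothetical Python stand-in for a typed store location. Its run-time isinstance check only imitates the static guarantee that strong typing would give.

```python
class Location:
    """A mutable store location whose contents must keep their original type."""
    def __init__(self, initial):
        self._type = type(initial)
        self._contents = initial

    def get(self):
        return self._contents

    def set(self, value):
        # Stand-in for strong typing: reject values of an incompatible type.
        if not isinstance(value, self._type):
            raise TypeError(
                f"location holds {self._type.__name__}, got {type(value).__name__}")
        self._contents = value


step = Location(1)

value_link = step.get()   # link to the value: fixed at composition time
location_link = step      # link to the location: sees later updates


def advance_by_value(n):
    return n + value_link


def advance_by_location(n):
    return n + location_link.get()


step.set(10)              # update made after the "program" was composed

print(advance_by_value(5), advance_by_location(5))
```

The procedure linked to the value keeps using the contents seen at composition time, while the one linked to the location picks up the update. An attempt such as step.set("ten") is rejected because the type would change.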



2.3 Experience 

The benefits of hyper-programming described in [1, 3, 4] may be summarised as: 

• being able to perform program checking early 

• support for source representations of all object closures 

• being able to enforce associations from executable programs to source pro- 
grams 

• availability of an increased range of linking times 

• increased program succinctness 

• increased ease of program composition 

3 Current Work 

3.1 Options for Further Development 

Hyper-programming as described in the previous section is implemented in 
Napier88 [5] and using a persistent form of Java, PJama [6]. Both implementations 
are based on the use of a closed-world, single-language, programming 
environment. The principal advantage of this is the degree of control that can be 
exercised over the data and code within the environment. In particular, a type 
system can be enforced over the entire lifetime of the data and code, and 
referential integrity can be guaranteed by the environment implementation. Thus, once 
established, a reference between two components will never become accidentally 
invalid. 

The use of such an environment offers various benefits, as discussed previously, 
at the cost of limiting flexibility. There are thus two main avenues for 
further development of the hyper-programming concept: 

• to further pursue the benefits of using a closed-world system, accepting the 
limitations that this implies; and 

• to investigate how far the closed-world restrictions may be relaxed to increase 
flexibility, while retaining at least some of the original benefits of 
hyper-programming. 

Sections 3.2 to 3.4 describe three areas of research based on a closed-world 
platform: hyper-code, in which a single uniform representation of code and data is 
presented throughout the programming life-cycle; support for application evolution 
based on tracking relationships between system components using referential 
integrity; and statically checkable dependent types. Some other areas in which 
a closed-world could be exploited, although not discussed further here, include: 

• version control, configuration management and documentation systems [1]; 
and 

• debugging, profiling and optimisation [7]. 

Sections 3.5 and 3.6 examine two ways in which the hyper-program platform 
constraints may be usefully relaxed: constructing programs over an unreliable 
network such as the World Wide Web; and hyper-programming using commer- 
cially significant languages and inter-operability standards, such as C++ [8], 
CORBA [9], DCOM [10] etc. 

3.2 Hyper-Code 

One of the original motivations for persistent programming was to remove the 
conceptually unnecessary distinction between short-term and long-term data 
[11]. This was followed by the recognition that code and data can usefully be 
treated in a uniform way [12]. Hyper-programming itself involved a further 
unifying step in which source programs themselves became persistent data, along 
with the compilers, editors and other tools with which they were manipulated 
[4]. There has thus been a progression of attempts to encompass ever more of the 
disparate entities that comprise a Persistent Application System (PAS) within 
a unified framework. 

Visual interaction with persistent data, such as that provided by generic 
object browsing systems [13-19], has proved to be a convenient and natural way 
for database users to address informal queries over the contents of a database. 
The users of such tools can browse freely around the data structures and values 
of a database, avoiding the necessity to write down algebraic expressions to 
perform the equivalent accesses. Where appropriate it is also possible to perform 
updates or invoke more complex methods over the objects depicted on the screen. 
Such tools are greatly preferred to a traditional query-based approach for simple 
queries and updates to persistent data such as held in object-oriented databases. 

The advantages of this style of access are comparable to the advantages of a 
modern iconic operating system interface over a traditional command-line based 
approach. In addition, however, a more general programming algebra is required 
so that more complex and longer-running queries may be handled. This rather 
frustratingly gives rise to two quite separate mechanisms for manipulating the 
same values within a system, with the choice of mechanism being somewhat 
arbitrary for tasks in the middle ground between trivial and complex. 

Current work on hyper-code aims to complete the progressive integration of 
PAS entities [20], by presenting the programmer with a single representation 
form for all code and data throughout all stages of the programming process. 
These stages include at least object store browsing, program construction, ex- 
ecution, debugging and maintenance. The single representation form is based 
on source code, the argument being that all other forms of code and data are 
used for pragmatic implementation-driven reasons, rather than being conceptu- 
ally necessary. Since the representation must be able to accommodate closures, 
by necessity it is a hyper-program form that can include direct links. 

Hyper-code provides the basis for a new style of editor that includes three uni- 
fying concepts, the combination of which makes the editor the only mechanism 
that is required for interaction with the database system. The three important 
unifying concepts are: 

• Data of any type supported by the system may be browsed and edited in a 
uniform manner. This includes a uniform treatment of procedure closures; 
a drawback of previous browsers is that they could not adequately handle 
procedures. 

• Source code is treated not as a fundamental building block within the pro- 
gramming system, but instead as a transient text-based view of a value. The 
source does not have a conceptual permanent existence within the system, 
but is apparently generated from any value that may be browsed. 

• As a further consequence of the generic treatment of procedure values and 
source code, the artificial distinction between source and executable values 
within a running system is completely removed. 

The major difference between this and other browsers is therefore in the 
uniform treatment of the executable and source code forms of procedures, and 
hence programs. Furthermore, the manipulation of code made possible by the 
unification strategy is sufficiently general to subsume the usual process of 
program editing, compilation and linking which is normally associated with the 
manipulation of code bodies within a system. In constructing a program, the 
programmer writes hyper-code. During execution, during debugging, when a 
run-time error occurs or when browsing existing programs, the programmer is 
presented with, and only sees, the hyper-code representation. Thus the programmer 
need never know about those entities that the system may support for reasons of 
efficiency, such as object code, executable code, compilers and linkers. These are 
maintained and used by the underlying system but are merely artifacts of how 
the program is stored and executed, and as such are completely hidden from the 
programmer. 

A consequence of the above is that the hyper-code editor is the only interfac- 
ing tool required to perform queries of any complexity against the database, or 
to introduce new data and program to it. The programmer may thus concentrate 
on the inherent complexity of the application rather than on that of the support 
system. 

Hyper-Code Operations. The previous hyper-programming implementations 
in Napier88 [21] and Java [19] approach this ideal, but fall short in two ways. 
Firstly, the programmer is aware of a distinction between the source and com- 
piled versions of code entities; and secondly, code and data entities are manipu- 
lated differently, using an editor and an object browser respectively. Hyper-code 
removes these distinctions. In the first case, the occurrence of system activities 
such as compilation and linking is hidden, since they are implementation details 
— the view presented to the programmer is one of source level interpretation. 
In the second case, all interaction with the hyper-code system is via a single 
hyper-code editor that fulfils the functions of both the browser and editor in the 
previous systems. The hyper-code editor supports only the following operations: 

• evaluate: this executes a selected fragment of hyper-code and returns the 
result, if any, as a new hyper-code fragment; 

• explode: this expands a selected link in a hyper-code fragment to show more 
detail, which is itself expressed in the form of hyper-code; 

• unexplode: this contracts an exploded link back to its original form; 

• edit: this includes all conventional editing facilities; 

• get root: this returns a selected persistent root, as a hyper-code fragment. 

When composed, these operations are sufficient to support all program con- 
struction, execution and persistent object browsing activities. Note that various 
system activities are implicit in the operations. For example, the implementation 
of the evaluate operation involves syntax checking, compilation and invocation 
of the selected code representation. 

The semantics of the hyper-code operations can be defined in terms of four 
abstract operations, which are reflect, reify, execute and transform. As shown in 
Fig. 4, these operate on two distinct domains: the domain of persistent hyper- 
code entities and the domain of hyper-code representations. The former domain 
contains all of the first class values defined by the programming language, to- 
gether with various non-first-class entities for which it may be useful to have 
representations, such as types, classes and executable code. Only the latter do- 
main, that of hyper-code representations, is made explicit to the programmer. 

[Figure: the Hyper-Code Entities Domain and the Hyper-Code Representations Domain] 

Fig. 4. Hyper-Code Domains and Abstract Operations 

The reflect and reify abstract operations simply map between the hyper-code 
entities and their representations. The execute operation takes place within the 
hyper-code entities domain: it involves the execution of an executable entity, 
potentially with side-effects on the domain. Correspondingly, the transform 
operation takes place within the representation domain, involving the manipulation 
of hyper-code representations. The hyper-code operations can be understood in 
terms of the abstract operations as follows: 

• evaluate first reflects a hyper-code representation to a corresponding hyper- 
code entity. If that entity is executable it is executed. If the execution pro- 
duces a result entity, or if the original entity is non-executable, that entity 
is reified to produce a result representation. 

• explode and unexplode both reflect a hyper-code representation to a corre- 
sponding hyper-code entity, and then reify that entity to produce a more or 
less detailed result representation, respectively. 

• edit involves transformation of an existing or null hyper-code representation 
into a new representation. 

• get root involves reification of a hyper-code entity to produce a representa- 
tion. 

It should be stressed that the abstract operations are purely definitional: 
only the hyper-code representations domain and the hyper-code operations are 
visible to the programmer. 
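
The decomposition of evaluate into reflect, execute and reify can be sketched in Python. This is a minimal model under stated assumptions: a representation is taken to be a text expression plus a dictionary of named links, entities are ordinary Python objects, and the Representation class and the placeholder name "result" are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Representation:
    """Hyper-code representation: text plus named links to entities."""
    text: str
    links: Dict[str, Any] = field(default_factory=dict)


def reflect(rep: Representation) -> Any:
    """Map a representation to an entity in the entities domain."""
    if rep.text in rep.links:            # a bare link denotes the entity itself
        return rep.links[rep.text]
    code = compile(rep.text, "<hyper-code>", "eval")
    # The links are the only names visible to the compiled expression.
    return lambda: eval(code, {"__builtins__": {}, **rep.links})


def execute(entity: Any) -> Any:
    """Run an executable entity inside the entities domain."""
    return entity()


def reify(entity: Any) -> Representation:
    """Map an entity back to a representation consisting of a single link."""
    return Representation("result", {"result": entity})


def evaluate(rep: Representation) -> Representation:
    """evaluate = reflect, then execute if executable, then reify."""
    entity = reflect(rep)
    if callable(entity):
        entity = execute(entity)
    return reify(entity)


cell = [41]                                        # an existing entity
rep = Representation("cell[0] + 1", {"cell": cell})
result = evaluate(rep)
print(result.links["result"])
```

Only Representation values cross the boundary to the user of evaluate, matching the point that the abstract operations are definitional. In the same model, explode and unexplode would be a reflect followed by a reify at a different level of detail.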

Hyper-Code Representations. The operations and domains described in 
the previous section may be applied to an implementation of hyper-code in any 
suitable language. The precise form of the hyper-code representation (HCR) will 
vary depending on the syntax of the chosen language, but will be guided by the 
following criteria that will apply for all languages: 

• The HCR must accommodate new programs written in the normal way. This 
implies that the representation must include pure text as a special case. 

• The HCR must support hyper-program links, for the reasons already dis- 
cussed. 

• The HCR must support detailed views of linked entities, to arbitrary levels 
of detail, in order that the hyper-code editor may subsume the functions of 
an object browser. 

• Since there must only be a single HCR, the detailed views of entities must 
themselves comprise text and hyper-program links in the same form as could 
be constructed by the programmer. 








• Furthermore, the detailed views should be self-contained and syntactically 
valid. Thus, for any detailed view of an entity, it should be possible to copy its 
representation, paste this into a new window, and evaluate it without error. 
The result of this evaluation will depend on the semantics of the language. 

Currently we have designed HCR forms for PJama and ProcessBase^2, and 
have implemented a prototype in PJama. Fig. 5 shows an example in ProcessBase, 
in which unexploded links to values are denoted by rounded white rectangles, 
and unexploded links to types by rounded black rectangles. Exploded 
links are denoted by shaded rectangles, with the internal details depending on 
the particular entity. The example shows the definition of a procedure newPerson, 
which takes a name and an age as parameters, and returns a view (record) 
containing them and a unique id number. The id is obtained by calling another 
procedure to increment a shared location, and then dereferencing that location. 



let newPerson <- fun (newName : [string]; newAge : [int]) 
begin 
    [fun () ; [loc] := '[loc] + 1] () 
    view (name <- newName; age <- newAge; id <- '[loc]) 
end 

[view [name : [string]; age, id : [int]]] 

Fig. 5. Example of Hyper-Code Representation in ProcessBase (text rendering of 
the figure; square brackets stand for the graphical hyper-links) 
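
For readers unfamiliar with ProcessBase, the behaviour described for newPerson can be approximated with a Python closure. This is an analogue under stated assumptions, not a translation: the one-element list stands in for the shared location, a dictionary stands in for the view, and the sample names and ages are invented.

```python
def make_new_person():
    """Build a new_person procedure whose closure shares one id location."""
    loc = [0]    # one-element list standing in for a shared store location

    def increment():
        # Counterpart of the linked procedure that updates the location.
        loc[0] = loc[0] + 1

    def new_person(new_name, new_age):
        increment()
        # Dereference the shared location to obtain the unique id.
        return {"name": new_name, "age": new_age, "id": loc[0]}

    return new_person


new_person = make_new_person()
first = new_person("Ada", 36)
second = new_person("Alan", 41)
print(first["id"], second["id"])
```

The location is shared between increment and new_person through the closure. Representing that sharing in source form is what the direct links of the hyper-code representation provide.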



Our HCR design for PJama, an example of which is shown in Fig. 6, is similar 
to that in Fig. 5, although it is less elegant due to the higher number of 
non-first-class entities to which it must support linking, and the presence of 
non-public object fields. 




Fig. 6. Example of Hyper-Code Representation in PJama 



^2 A simple persistent language being developed as part of the Compliant Systems 
Architecture project [22, 23]. 

3.3 System Evolution 

Hyper-programming is also the basis for providing new solutions to the problem 
of schema editing, which requires location and translation of affected queries and 
data [24]. The essential elements are at hand in the hyper-programming system. 
The schema may keep a record of which programs (queries) and data are associated 
with particular parts of the schema via secure links. The programs always 
have hyper-program source and therefore source code and data translation is 
possible. The schema evolution mechanism transforms the programs and data 
affected by a schema edit. This is achieved as follows: 

• Locate, from the schema, all affected programs and data. 

• For each program which may be affected, obtain its hyper-program. 

• Locate the points in the hyper-program which access the changed part of 
the schema and edit the hyper-program to reflect the new logical schema 
structure. This will involve establishing new links both to and from the 
changed part of the schema. 

• Update the old program with the new one. 

• Update the affected data with new versions. 

The extent to which this process can be automated depends upon the complexity 
of the schema change incurred. The essential point is that all interrogation 
and manipulation of schema, program and data occurs within a single integrated 
environment, and may therefore be represented as a meta-level program within 
that environment. 
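
The steps listed above can be sketched as a small meta-level program. The Python below is a toy model under stated assumptions: the Schema class, its link and rename methods, and the representation of a hyper-program as a list of parts in which a link is denoted by a field name are all invented for illustration. Only a field rename is handled, the simplest kind of schema edit.

```python
from collections import defaultdict


class Schema:
    """Toy schema that records which programs are linked to each field."""
    def __init__(self, fields):
        self.fields = set(fields)
        self.users = defaultdict(set)    # field name -> names of linked programs

    def link(self, program_name, field_name):
        if field_name not in self.fields:
            raise KeyError(field_name)
        self.users[field_name].add(program_name)

    def rename(self, old, new, programs, data):
        """Apply a schema edit and propagate it to affected programs and data."""
        affected = self.users.pop(old, set())      # locate affected programs
        self.fields.remove(old)
        self.fields.add(new)
        for name in affected:
            source = programs[name]                # obtain the hyper-program
            # Edit the source and replace the old program with the new one.
            programs[name] = [new if part == old else part for part in source]
            self.users[new].add(name)              # establish the new links
        for row in data:                           # update the affected data
            if old in row:
                row[new] = row.pop(old)
        return affected


schema = Schema(["name", "addr"])
programs = {"report": ["print ", "addr"], "count": ["len ", "name"]}
schema.link("report", "addr")
schema.link("count", "name")
data = [{"name": "ron", "addr": "Fife"}]

changed = schema.rename("addr", "address", programs, data)
print(changed, programs["report"], data)
```

Because schema, programs and data live in one environment, the whole edit is an ordinary procedure call. A more complex schema change would need correspondingly more of the transformation to be supplied by hand.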

The mechanism relies heavily upon the self-contained nature of the persis- 
tent environment. As all the data and code is held in the same environment as 
the schema, it is possible to keep not only links from the schema to the data it 
describes but also reverse links from the schema to programs which bind to par- 
ticular points of it. The hyper-programming concept makes it possible to map 
between executable and source representations. The fact that these representa- 
tions are themselves values within the persistent environment, along with the 
provision of a compiler in the same environment, makes this strategy possible. 



3.4 Dependent Types 

In addition to data access checking as described in Section 2.2, language systems 
also perform other kinds of checking at run-time, some of which can be performed 
earlier in a hyper-programming system. An example of this is dependent type 
checking [25]. 

A dependent type is a type that depends on a value. In general this requires 
dynamic type checking. To determine whether two dependent types are compatible, 
the language's type checker takes account of the associated values as well as 
their structure. An example of a dependent type is the generic type map [26], 
instances of which are associations between sets of values. The type of a par- 
ticular map is dependent on the identity of the procedure that defines equality 
over the key set. Because of this it is not generally possible to type-check at 
compilation-time a program that contains map operations, as the map values 
themselves must be tested. 

In a hyper-programming system the value on which a dependent type depends 
may be linked directly into a program, and may thus be available for checking at 
compilation-time. This makes it possible for the system to check operations on 
dependent types at compilation-time rather than planting code in the executable 
program to perform the checking at run-time. The system may also provide tools 
that allow the programmer to verify the type compatibility of selected values 
before they are linked into the hyper-program. 
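
The difference between the two checking times can be illustrated by analogy in standard C++. This is a sketch under stated assumptions, not the mechanism of [25] or [26]: a template parameter stands in for a value linked into the hyper-program, and the names `DynMap`, `StaticMap`, `exactEq` and `lengthEq` are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

using KeyEq = bool (*)(const std::string&, const std::string&);

bool exactEq(const std::string& a, const std::string& b) { return a == b; }
bool lengthEq(const std::string& a, const std::string& b) {
    return a.size() == b.size();
}

// Run-time variant: the equality procedure is an ordinary value, so the
// compatibility of two maps can only be tested while the program runs.
struct DynMap {
    KeyEq eq;
    std::vector<std::pair<std::string, int>> entries;
};
bool compatible(const DynMap& a, const DynMap& b) { return a.eq == b.eq; }

void mergeDyn(DynMap& into, const DynMap& from) {
    assert(compatible(into, from));  // check planted in the executable
    for (const auto& e : from.entries) {
        bool found = false;
        for (auto& mine : into.entries)
            if (into.eq(mine.first, e.first)) {
                mine.second = e.second;
                found = true;
                break;
            }
        if (!found) into.entries.push_back(e);
    }
}

// Compile-time variant: the procedure is fixed when the program is
// composed, so its identity is part of the map's type.
template <KeyEq Eq>
struct StaticMap {
    std::vector<std::pair<std::string, int>> entries;

    void put(const std::string& key, int value) {
        for (auto& e : entries)
            if (Eq(e.first, key)) { e.second = value; return; }
        entries.emplace_back(key, value);
    }
    // Only maps over the same equality procedure can be merged; any other
    // argument is rejected by the compiler, with no run-time test.
    void merge(const StaticMap<Eq>& other) {
        for (const auto& e : other.entries) put(e.first, e.second);
    }
};

static_assert(!std::is_same<StaticMap<exactEq>, StaticMap<lengthEq>>::value,
              "maps over different equality procedures have different types");
```

In `mergeDyn` the compatibility test survives into the executable, whereas `StaticMap::merge` needs none because the equality procedure was known when the program was composed.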

More generally the programmer may perform arbitrary checks on data val- 
ues before linking them into a hyper-program, by writing and executing other 
programs that compute over them. If the checks succeed, the code that performs 
the checking can then be omitted from the main hyper-program, since the links 
to the original values are guaranteed to remain intact. 

3.5 Internet Programming 

The potential association between the concept of hyper-programming and the 
Web is obvious. The source format of hyper-programs is similar to hypertext, 
and the Web provides a well-known hypertext system over the global 
autonomous network. The clear appeal, therefore, is to somehow extend the 
paradigm to make it work in this context. 

This appeal, however, is fraught with serious technical difficulty, and it would 
be over-ambitious and pre-emptive to attempt to document it fully in this paper. 
We therefore restrict the discussion to an elaboration of the problems involved, 
and outline strategies which we believe may eventually provide solutions. 

Problems exist in the following categories: 

• how can program source be represented? 

• how can typed data be integrated with the http protocol? 

• how could data deriving from other web sources be integrated in a typed 
computation? 

• how can the potential failure of references be made tolerable? 

These topics are currently under investigation within the framework of the 
Hippo project at the University of Glasgow^. Here we describe only the direction 
taken for further investigation within each category. 

Program Source. To be properly compatible with the Web, it is necessary 
to represent hyper-programs in a standard text-based form. In the hyper-pro- 
gramming prototypes that have been built, program source is represented in a 
proprietary format, manipulated only by specially written editor/browser soft- 
ware. This allows the presentation of the program source to the programmer 
to be strongly associated with the programming language definition. However, 
to move to a standard internet treatment, the program source format must be 
open, textual, and ideally should be HTML itself. 

^ www.hippo.org.uk 

One of the known (and as yet largely unaddressed) problems associated with 
hyper-programming is how standard language treatments, such as the definition 
of typing and semantics, can be adapted to the hypertext domain. Widely used 
methodologies for formal definitions and proofs invariably rely upon a textual 
source representation; while we can claim properties for hyper-programs on a 
purely intuitive level, it is not clear how to proceed with elementary proofs 
within a derived system, to demonstrate beyond doubt that there is no flaw in 
the soundness of the derived language. 

Our proposed solution to these problems is to use a two-level language repre- 
sentation and definition. At a high level, humans can interact with a hypertext 
source, whereas at a lower level the program is actually represented in HTML, 
including a standard use of hypertext anchors to represent hyperlinks within pro- 
grams. This allows standard HTML tools, such as high-level composition tools 
and browsers, to be used as a human-readable interface over the low-level repre- 
sentation. The low-level representation, using standard HTML, allows text-based 
protocols to be used to interpret and transport the HTML. 

The difficulty with such an approach is how to define the overall system 
in a manner which gives a clear and formal definition of its semantics. The 
overall system will be relatively complex, in comparison with existing hyper- 
programming systems where an intuitive semantics is relatively acceptable, given 
that the low-level representation is not patent. 

One approach to this problem is based on the definition of the two-level pro- 
gramming algebra using linguistic reflection as a language definition technique. 
This approach is based upon the use of compile-time reflection, as defined in [27]. 
A subset of HTML may be defined as the core programming algebra, making it 
possible to define the semantics of both standard language features and hyper- 
links. A hyper-text view of programs, as may be presented by both specialist 
program editors and standard browsers, can be defined (using the terminology 
of [28]) as a reflective sublanguage, which is used to generate the HTML-based 
textual form during static analysis by the programming system implementation. 

Using linguistic reflection as a definitional mechanism gives a well-defined for- 
mal framework in which hyper-programs can be described using relatively con- 
ventional definition techniques. Furthermore, it gives a framework wherein the 
core definition of hyper-programs is text-based, thus allowing their transporta- 
tion around the various text-based protocols of the Internet without resorting 
to ad-hoc translation techniques. 

Typed Data. Given a persistent programming language which can be used 
to program over embedded URLs, the next step is to consider how a URL can 
be used to refer to typed data, even supposing that the URL refers to data 
generated by the same programming system. The problem in turn decomposes 
into three further issues. These are: 

• unifying the global persistent namespace with those namespaces used in the 
Web; 

• unifying the representation of the typed persistent data with that commonly 
used on the Web, namely HTML; 

• introducing type system mechanisms which allow the integration of remote, 
unreliable, and autonomous data with an otherwise static type system. 

Each of these presents significant technical challenges, and is not further ex- 
panded in this context. Interested readers are referred to [29] for a more detailed 
exposition of the approach taken; once again, solutions to these problems are 
still beyond our grasp. 

Importing Data. The full potential of a web-based hyper-programming system 
would only be met if it were possible to include links to data which had been 
generated by some system other than the particular programming language in 
use. Once again, this is an enormous issue and cannot be addressed in this 
short space. There are two simple solutions: the first is to read the data as 
text or MIME, and restrict the typing of such links according to its transmitted 
classification. This results in a type safe language, assuming the consistent use 
of the protocol, but does not really address the spirit of the problem. The other 
simple solution is to publish the format used for the system's own typed data, 
and ensure it is possible to generate that externally. However, any serious uptake 
of this system then requires the retrospective adoption of a new data standard, 
which is unlikely to succeed. 

The more ambitious goal is to attempt to analyse arbitrary data resulting 
from an http request for appropriate structural content and, if it is suitable, 
integrate it into a typed computation. The outline of our approach is for the 
programmer to specify a required type for the binding during the composition 
process. The URL is duly fetched, and translated into a semi-structured format 
according to a number of ad-hoc rules^. Having achieved a semi-structured 
representation of the data, the programmer's asserted type is used to derive a subset 
of the data which corresponds to the same structure. This data is extracted and 
incorporated into the ongoing computation. An estimation of how well the data 
fits the expected type is also generated, and may be either returned to the user 
of the program or used within the running program. 
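
The type-directed extraction step can be sketched as follows. This is a minimal illustration under assumptions made here, not the approach of [30, 31]: the semi-structured form is taken to be a list of field-to-text maps, the asserted type is a list of required field names, and the fit estimate is simply the fraction of records containing every required field. The names `SemiStructured`, `Extraction` and `extractAs` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using SemiStructured = std::vector<std::map<std::string, std::string>>;

struct Extraction {
    SemiStructured records;  // only the fields named by the asserted type
    double fit;              // fraction of input records that matched
};

// Project `input` onto the fields of the asserted type and report how
// well the data fitted that type.
Extraction extractAs(const SemiStructured& input,
                     const std::vector<std::string>& requiredFields) {
    assert(!requiredFields.empty());
    Extraction out{{}, 0.0};
    for (const auto& rec : input) {
        std::map<std::string, std::string> projected;
        for (const auto& field : requiredFields) {
            auto it = rec.find(field);
            if (it != rec.end()) projected[field] = it->second;
        }
        // Keep the record only if every required field was present.
        if (projected.size() == requiredFields.size())
            out.records.push_back(projected);
    }
    if (!input.empty())
        out.fit = static_cast<double>(out.records.size()) / input.size();
    return out;
}
```

The caller can either report `fit` to the user or test it within the running program before using `records`.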

Although we have evidence that the outline given above is possible to engi- 
neer [30, 31], and furthermore gives a viable and understandable programming 
system, each of the steps described presents its own major problems and the 
production of such an integrated programming system is still beyond current 
understanding. 

Internet Hyper-Programming? In summary, there is a clear and easy in- 
tuition that an extension of the hyper-programming paradigm to encompass 
the global hypertext concepts of the World-Wide Web will result in a powerful 
distributed programming paradigm. While we believe that this is the case, 
on deeper inspection the technical issues underlying such a paradigm shift are 
profound. A great deal of work remains to be done before we can argue convincingly 
that the extended concept is feasible whilst retaining a sound and disciplined 
programming system. 

^ The ad-hoc nature of this part of the process can be entirely circumvented when the 
document is XML, which we perceive to be a rapidly emerging standard for Web 
information. 



3.6 An Open C++/DCOM Hyper-Programming Environment 

In this section we report on an attempt to apply the hyper-programming model 
in the context of an open system. We chose a DCOM/C++ system for the experimentation 
for a number of reasons. Firstly, both C++ [8] and DCOM [10] are 
being used by a large number of programmers to build systems in the real world. 
Secondly, having programmed with DCOM and C++, we felt there was a high 
degree of accidental complexity associated with this style of programming that 
was not intrinsic in the problem domain. We hoped that hyper-programming 
might be used to simplify the construction of DCOM programs. Finally we 
were influenced by the Hippo work of Connor [29] and sought to discover if 
C++/DCOM programs could be written which had the same flavour as Hippo 
programs. If this was possible, the power of the many C++ libraries and environments 
could be used to construct Web utilities cheaply. In addition to creating a 
hyper-programming environment for a commercial system, a deliberate attempt 
was made to maximise the use of freely available software and to avoid writing 
new software whenever possible. 

Hyper-Program Construction. A DCOM/C++ hyper-program is construc- 
ted using two tools: a text editor and a binder. These are used to specify the 
hyper-program text and the hyper links respectively. The output from these tools 
is fed into a pre-processor which unifies the source and the links into standard 
C++ prior to presentation to the gnu-C++ compiler. The pre-processor also 
creates files and directories for cache maintenance and in some circumstances 
pre-fetches Web pages. 

Editing Environment. The first tool requirement was for a text editor capa- 
ble of incorporating hyperlinks and suitable for editing programs. Web editing 
tools such as Netscape Composer and FrontPage do not support the editing of 
programs since they are intended as HTML composition tools. Consequently 
Emacs [32] was used with a (then) freely available extension called Hyperbole 
[33]. Hyperbole supports the inclusion of hyperlinks into documents. In particu- 
lar, these links can refer to Uniform Resource Locators (URLs), i.e. Web pages, 
and can be clicked on with the mouse. A Hyperbole user works with buttons 
embedded within textual documents. These buttons may be created, modified, 
moved or deleted. Each button performs a specific action, such as linking to a 
file or executing a shell command. Fig. 7 shows a C++ hyper-program being 
edited with the Emacs/Hyperbole environment. 

Fig. 7. Emacs and Hyperbole 

The Hyper-Program Source Code. The program shown in Fig. 8 is a 
C++/DCOM hyper-program that finds the telephone number of a member of the 
Computer Science Department at Glasgow University. It does this by scanning an 
HTML page denoted by the hyperlink telephonedirectory. The program creates 
a binding, denoted by h and of type IHTML*, to this Web page. The IHTML class 
shown in Fig. 9 supports a number of operations including the find_in_line 
method, which searches lines of the page looking for the sub-string specified in 
the parameter. If a match is found the line is returned. It also contains a predicate 
at_end indicating that the end of the page has been reached. 



void main (int argc, char** argv) 
{ 
    BOOL end = FALSE; 
    BOOL is_found = FALSE; 
    OLECHAR *line; 
    IHTML *h = <(telephonedirectory)>; 
    while (SUCCEEDED(h->at_end(&end)) && !end) { 
        if ((SUCCEEDED(h->find_in_line( 
                 argv[1], &line, &is_found)) && 
             is_found)) { 
            printf("Details are %s \n\r", line); break; } 
        if (FAILED(h->next_line())) break; 
    } 
    if (end) printf("didnt find %s", argv[1]); 
} 



Fig. 8. A C++/DCOM Hyper-Program 
interface IHTML : IUnknown 
{ 
    HRESULT display_line(); 
    HRESULT openURL([in, string] char* filename); 
    HRESULT next_line(); 
    HRESULT find_in_line([in, string] char* name, 
        [string, out] OLECHAR** line, [out] int* isfound); 
    HRESULT at_end([out] int *i); 
} 



Fig. 9. MIDL Definition of the IHTML Interface 



The code shown in Fig. 8 is standard DCOM/C++ except for the line 

    IHTML*h = <(telephonedirectory)>; 

which has to be replaced with standard C++. As described above, this task is 
performed by the pre-processor. The code sequence into which this hyper-link is 
expanded depends on the binding style specified in the binder. This is described 
in the next section. 

Creating Bindings. Using the Hyperbole environment, bindings can be made 
to any Web based data. However, this does not address the need to specify 
attributes associated with those links such as programming language type, ex- 
ternal data type, the location of the data being bound and binding time. To 
allow hyper-programmers to specify and view bindings, a Web interface to a 
binder has been created and is shown in Fig. 10. 

The binder permits users to specify a name for a hyper-link. This is used 
to match the hyper-links entered in the editor with bindings specified in the 
binder. The next field is the type of the object in the programming language 
context. In the current implementation this field contains a string which is used 
to specify the programming language type of the target object. This field is 
strictly unnecessary since it could be automatically generated but makes the 
generated code more readable. The next field, IID, is used to specify the type 
(interface) of the object being linked to. In the example shown in Fig. 10, the 
link is to an object of type IHTML, shown in Fig. 9. The CLSID field is 
used to specify a class library containing executable code implementing the class 
specified in the IID field. For DCOM aficionados, this is used by a class 
moniker to locate the class object. The URL field specifies the location of the 
data to which the link refers. 

The last field is used to specify the time at which the binding is resolved. 
There are currently two options supported: compile time and run time. These 
settings change the behaviour of the pre-processor and cause different code to 
be generated. When the compile-time option is chosen the pre-processor pre- 
fetches a copy of the target and stores it locally. In this case the code generated 
contains fewer run-time checks since the data will always be accessible. When 
run-time binding is employed, failure at run time is possible and consequently 
the generated code needs to be more sophisticated. The code generated for the 
example program shown in Fig. 8 is given in the next section. 
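
How the binder record might drive the expansion of a hyper-link can be sketched as below. This is an assumption-laden illustration, not the pre-processor's actual output: the `Binding` fields mirror the binder form, but `expandLink`, the `cacheRoot` parameter and the helper name `bindToURL` appearing in the generated text are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class BindingTime { CompileTime, RunTime };

// One binder record; the fields follow the binder form described above.
struct Binding {
    std::string name;   // matches the <(name)> hyper-link in the editor
    std::string type;   // programming language type, e.g. "IHTML*"
    std::string iid;    // interface of the object being linked to
    std::string clsid;  // class implementing that interface
    std::string url;    // location of the data
    BindingTime when;   // when the binding is resolved
};

// Replace each <(name)> in `source` with generated text. `bindToURL` is a
// hypothetical helper standing for the moniker and class-factory calls;
// `cacheRoot` is where pre-fetched copies are assumed to be stored.
std::string expandLink(std::string source, const Binding& b,
                       const std::string& cacheRoot) {
    assert(!b.name.empty());
    const std::string link = "<(" + b.name + ")>";
    const std::size_t scheme = b.url.find("://");
    const std::string path =
        scheme == std::string::npos ? b.url : b.url.substr(scheme + 3);
    // Compile-time binding opens the local copy; run-time binding opens
    // the URL itself, so the generated program must cope with failure.
    const std::string target =
        b.when == BindingTime::CompileTime ? cacheRoot + "/" + path : b.url;
    const std::string code =
        "bindToURL(" + b.clsid + ", " + b.iid + ", \"" + target + "\")";
    for (std::size_t pos = source.find(link); pos != std::string::npos;
         pos = source.find(link, pos + code.size())) {
        source.replace(pos, link.size(), code);
    }
    return source;
}
```

The only difference between the two binding times in this sketch is the target that the generated code opens; a fuller version would also emit the extra failure handling that run-time binding requires.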




Fig. 10. Entering Details into the Binder 



Binding Times and Errors. The code generated depends on the binding time 
specified in the binder. Fig. 11 shows a slightly simplified version of the code 
generated for the hyper-program shown in Fig. 8 if construction time (eager) 
binding is specified. This code assumes that the binder has loaded the Web page 
into the local cache (/home/sag/cache). The dynamic case is similar but requires 
additional code to fetch the page across the network. The code generated is 
straightforward DCOM code. 

void main(int argc, char** argv) 
{ 
    OLECHAR *line = 0; 
    IHTML* h = 0; 
    IClassFactory *pcf = 0; 
    HRESULT res = S_OK; 
    IMoniker *pmk = 0; 
    IBindCtx *pbc = 0; 

    Check(CreateBindCtx(0, &pbc), "CreateBindCtx failed"); 
    Check(CreateClassMoniker(CLSID_CWebObject, &pmk), 
          "CreateClassMoniker failed"); 
    Check(pmk->BindToObject(pbc, 0, IID_IClassFactory, 
          (void**)&pcf), "BindToObject failed"); 
    Check(pcf->CreateInstance(0, IID_IHTML, (void**)&h), 
          "CreateInstance failed"); 
    Check(h->openURL( 
          "/home/sag/cache/www.dcs.gla.ac.uk/contact/index.html"), 
          "Open URL failed"); 
    BOOL end = FALSE; 
    BOOL is_found = FALSE; 
    while (SUCCEEDED(h->at_end(&end)) && !end) { 
        if ((SUCCEEDED(h->find_in_line( 
                 argv[1], &line, &is_found)) && is_found)) { 
            printf("Details are %s \n\r", line); break; } 
        if (FAILED(h->next_line())) break; 
    } 
    if (end) printf("didnt find %s", argv[1]); 
    h->Release(); 
    pcf->Release(); 
} 

Fig. 11. Simplified DCOM Code Generated for Fig. 8 

Future Directions. All the examples and screen shots discussed thus far describe 
a system that has been implemented at the University of Stirling. However, 
this code represents the start rather than the end-point of what we are trying to 
achieve. We stated earlier that we were seeking an integration of C++/DCOM 
with hyper-programming and the ideas embodied in the Hippo system. We now 
describe how we can use what we have implemented to date to achieve this. 



void main (int argc, char** argv) 
{ 
    BOOL end; 
    IPersonSet *s = <(telephonedirectory)>; 
    while (SUCCEEDED(s->at_end(&end)) && !end) { 
        Person person; 
        if (SUCCEEDED(s->next_person(&person)) && 
            !strcmp(person.name, argv[1])) { 
            printf("Telephone number of %s is %s\n", 
                   argv[1], person.phone_no); 
            break; 
        } 
    } 
    if (end) printf("didn't find %s\n", argv[1]); 
} 



Fig. 12. A Strongly Typed C++ Hyper-Program 
The program shown in Fig. 8 treats the Web data as an HTML file, not as a 
typed entity. We would like to be able to re-write the hyper-program as shown in 
Fig. 12. In this example, rather than treating the data as HTML text, we have 
typed it as a set of objects of type Person. This requires a number of refinements 
to the mechanisms already implemented. First the HTML file must be typed as 
a set of Person. To achieve this, a MIDL interface definition of a set of Person 
is created as shown in Fig. 13. This type is structurally similar to the IHTML 
interface given earlier with the line type being replaced with records of type 
Person. Since the IPersonSet interface inherits from IHTML, it may use the 
IHTML interface to assist in the extraction of records of type Person from the 
text file. 



typedef struct { OLECHAR *name; OLECHAR *phone_no; 
                 OLECHAR *nickname; } Person; 

interface IPersonSet : IHTML 
{ 
    HRESULT next_person([out] Person* current); 
} 



Fig. 13. MIDL Definition of Person Set Interface 



Some mechanism must be provided to convert the textual data retrieved 
over the Web into typed objects (in this case of type Person). This task is 
encoded in the library providing the implementation of IPersonSet. Whilst this 
implementation could be hand coded, a more desirable approach would be to 
generate it automatically from a specification. There are two basic approaches 
to this: (i) use the MIDL as a specification for the Web format and (ii) use the 
Web format as a specification to generate the MIDL. 

If the first approach were employed, a tool could be engineered which took 
the MIDL interface and a URL as parameters and attempted to find records of 
the appropriate type in the file. In the case of the URL used in the examples in 
this section, the fields are all comma separated, making this task easy. This is 
similar to the construction of indices in database systems and the importation 
of records using Wizards in Microsoft Excel and Access. Once the index was 
created, generic code could be used to traverse the data and return records 
each time next_person was called. An alternative approach is to generate the 
IDL from the Web source. This approach is particularly attractive if the Web 
source is encoded in a structured or semi-structured manner, for example, using 
XML [34]. In both cases, generic code needs to exist which may be specialised 
to operate over records of an appropriate type. This may be achieved using the 
parametric polymorphism provided by the implementation language or using 
tools such as those suggested by Sheard and Stemple [35] or Kirby [36]. 
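
For the comma-separated case, the first approach might look like the sketch below. It is a portable illustration under assumptions made here rather than a DCOM implementation: `std::string` replaces `OLECHAR*`, `bool` replaces `HRESULT`, the field order (name, phone number, nickname) is assumed, and the class name `PersonSet` and helper `splitFields` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Person {
    std::string name;
    std::string phone_no;
    std::string nickname;
};

// Generic helper: split one line into comma-separated fields.
static std::vector<std::string> splitFields(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ',')) fields.push_back(field);
    return fields;
}

class PersonSet {
public:
    // Build the index once: one entry per line with exactly three fields.
    // Lines of any other shape (markup, headings) are skipped.
    explicit PersonSet(const std::string& page) {
        std::istringstream in(page);
        std::string line;
        while (std::getline(in, line)) {
            std::vector<std::string> f = splitFields(line);
            if (f.size() == 3) people_.push_back(Person{f[0], f[1], f[2]});
        }
    }

    bool at_end() const { return next_ >= people_.size(); }

    // Counterpart of next_person: copy the current record and advance.
    bool next_person(Person* current) {
        assert(current != nullptr);
        if (at_end()) return false;
        *current = people_[next_++];
        return true;
    }

private:
    std::vector<Person> people_;
    std::size_t next_ = 0;
};
```

Only the three-field test and the `Person` constructor call are specific to the record type; the splitting and traversal code is the generic part that a tool would specialise from the interface definition.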
4 Conclusions 

Our original motivation for hyper-programming was to allow the user to compose 
programs interactively, navigating the environment and selecting data items, 
including code, to be incorporated into the programs. We further believed that 
programming language systems could provide better support for the software 
engineering process than they do at present, in particular, with regard to the 
traditional compose-compile-link-execute cycle of program development. From our 
early implementations of hyper-programming we summarised that the attendant 
benefits of the concept are: 

• being able to perform program checking early 

• support for source representations of all object closures 

• being able to enforce associations from executable programs to source pro- 
grams 

• availability of an increased range of linking times 

• increased program succinctness 

• increased ease of program composition 

Here we have developed the hyper-programming notion towards presenting a single 
representation of data and code throughout the software process using hyper- 
code. Furthermore we have explored techniques for adapting hyper-programming 
to persistent contexts that do not enforce referential integrity, such as the WWW; 
and implementing and using hyper-programming in standardised languages and 
inter-operability mechanisms, such as C++ and DCOM. 
Abstract. Traditional database systems use ACID properties (Atomicity, 
Consistency, Isolation and Durability) to implement recovery and 
concurrency control. However, this implementation is not always appro- 
priate in distributed real time systems and in systems with long-lived 
transactions. For example, long-lived transactions may be active for days, 
and at the same time other transactions may need access to data, locked 
by the long-lived transactions. Therefore, extended transaction models 
have been developed. These transaction models only implement semantic 
ACID properties. That is, from an application point of view the system 
should function as if the traditional ACID properties were implemented. 
Multi user word processing, CAD and CASE systems may both be dis- 
tributed and have long-lived transactions. Therefore, extended transac- 
tion models may be useful in Computer Supported Cooperative Work 
(CSCW), where users work with shared data. In this paper we will try 
to integrate the research in extended transaction models with the CSCW 
research, which for many years has been aware of the shortcomings of 
the traditional ACID properties. In the transaction model in this paper 
the global atomicity property is implemented by combining the possi- 
bilities of either forcing the remaining updatings of a transaction to be 
executed or compensating the already executed updatings of the trans- 
action. The global consistency property may be managed by the CSCW 
system and/or by human beings supported by tools. The global isola- 
tion property is implemented by using countermeasures to the missing 
isolation of the updating transactions. The global durability property is 
implemented by using the durability property of the local CSCW/DBMS 
systems. In the extended transaction model described above we will incorporate 
some of the most promising CSCW commit/isolation features 
known from the scientific CSCW literature. 

Keywords: CSCW, distributed groupware, collaborative writing, se- 
mantic ACID properties, concurrency control, long-lived transactions 
1 Introduction 

CSCW systems may be grouped in synchronous and asynchronous groupware 
systems. 

In synchronous groupware systems all modifications can be observed real-time 
by all members of the collaboration. These WYSIWIS (What You See Is What I 
See) systems (Stefik et al., 1987) do not have a well-defined transaction concept, 
and, therefore, the ACID properties of such systems are not well defined either. 
Anyway, synchronous systems do have consistency problems, and, therefore, the 
tools of our transaction model may improve the situation. 

In asynchronous groupware systems (e.g. Koch, 1995 and Jones, 1995) a user 
may first modify his/her local version of the database/document. When the 
modifications of the user are ready to be published to the other users, a global 
updating transaction is executed, and in this situation the semantic ACID prop- 
erties of our transaction model may be important. 

In synchronous groupware systems traditional locks can normally not be 
recommended as they slow down the real time interaction of the users. In asyn- 
chronous systems locking cannot be recommended either, when some of the 
transactions are long-lived (Gray and Reuter, 1993). The problem is that lock- 
ing long-lived transactions exclude other users from making updatings, and this 
may not be acceptable. Therefore, traditional locking is normally not used in 
CSCW systems, which for many years have used different countermeasures that 
can reduce the problems occurring when traditional locking cannot be used for 
concurrency control. 

The objective of this paper is to illustrate how to integrate different com- 
mit/isolation protocols to facilitate the selection of the right combinations of 
properties/tools for a CSCW system in a specific application area. 

The paper is organized as follows: Section 2 will describe the transaction 
model used in this paper, i.e. we will give an overview of how the global semantic 
ACID properties can be implemented. In section 3 we will illustrate how to 
integrate different commit/isolation protocols for CSCW systems. Concluding 
remarks are presented in section 4. 

Related work: The systematic analysis of countermeasures, described in 
Frank and Zahle (1998), was not possible until the isolation property was de- 
composed into disjunctive isolation anomalies by Gray and Reuter (1993) and 
Berenson et al. (1995). 

For many years, extensive research has been carried out on CSCW systems with 
shared data in order to bypass the problems of traditional concurrency control 
(for example Ellis and Gibbs, 1989; Pacull et al., 1994; Koch, 1995; Jones, 1995 
and Salcedo et al., 1997). This paper may be viewed as a supplement to this 
field of research, where we use the disjunctive consistency problems of Gray 
and Reuter (1993) to describe in more detail the properties of the different 
commit/isolation protocols. 

The commit/isolation protocols may be described by rules. Therefore, the 
commit/isolation protocols may be implemented by using the flexible CSCW 
systems as described in e.g. Georgakopoulos et al. (1994) and Rusinkiewitz et 
al. (1995), where the rules of the transactions are defined by their activity type. 
In other words, it is possible to change the commit/isolation protocol by changing 
the activity type of the transactions. 

2 The Transaction Model 

In the following, we will give an overview of how the global semantic ACID 
properties are implemented in our transaction model. 



2.1 The Atomicity Property 

An updating transaction has the atomicity property and is called atomic if either 
all or none of its updatings are executed. In this paper we use the single pivot 
transaction model (Mehrotra et al., 1992; Zhang et al., 1994 and Frank, 1999) 
for atomicity implementation. In this transaction model the global transaction 
is partitioned into the following types of subtransactions that are executed at 
different locations: 

1. The pivot subtransaction that manages the atomicity of the global transac- 
tion, i.e. the global transaction is committed globally when the pivot subtrans- 
action is committed locally. If the pivot subtransaction aborts, all the updatings 
of the other subtransactions must be compensated or not executed. 

2. The compensatable subtransactions that all may be compensated. Compen- 
satable subtransactions must always be executed before the pivot subtransaction 
is executed in order to allow them to be compensated if the pivot subtransaction 
cannot be committed. Compensation is achieved by executing a compensating 
subtransaction. 

3. The retriable subtransactions that are designed in such a way that the 
execution is guaranteed to commit locally (sooner or later). Retriable subtransactions 
are executed after the local commit of the pivot subtransaction, because 
they have the pivot subtransaction as parent and are initiated by the pivot 
subtransaction. 

Example 

When a primary copy of an object is updated, created or deleted, the sec- 
ondary copies may be updated with global atomicity by using retriable sub- 
transactions. Suppose all users in a CSCW system have their own local 
workspace copy of a database, where a primary copy of the database is used 
to serialize and distribute the updating transactions. In this situation an 
updating user can send compensatable subtransactions to the other users 
via the primary copy location. All the updatings of the compensatable sub- 
transactions must be marked as compensatable. If the other users can accept 
the updatings from the compensatable subtransaction, they send an accept 
message to the primary copy location. If the primary copy location receives 
accept messages from all the involved users, a pivot subtransaction can commit
the updatings globally by committing the updatings in the primary copy
location. After this, retriable subtransactions initiated by the pivot
subtransaction are sent to all the users to de-mark the compensatable mark of the
updatings committed at the primary copy. The same retriable subtransac- 
tions may also try to upgrade any other compensatable marked updatings 
to the new object version. 



2.2 The Consistency Property 



A database is consistent if the data in the database obeys the consistency rules 
of the database. Consistency rules may be implemented as a control program 
that rejects transactions that do not obey the consistency rules.

In CSCW systems consistency rules may be managed by the CSCW system
if they are described and initiated by a user (see e.g. Decouchant et al., 1996).



2.3 The Isolation Property 



A database where all the transactions have the consistency property may still 
be inconsistent if the isolation property is missing. A transaction is executed
with the isolation property if the updatings of the transaction are seen by
other transactions only after the updatings of the transaction have been committed.

In our transaction model the global semantic isolation property is managed by 
using countermeasures against the isolation anomalies that occur when transac- 
tions are executed without the isolation property. In designing countermeasures 
it is possible to use local locking, but all locks should be released immediately 
after a subtransaction has been committed/aborted locally in order to avoid 
blocked data (Data is blocked if it is locked by a subtransaction that loses the 
connection to the parent transaction). 

If the isolation property is not implemented, four different types of isolation
anomalies may occur. If none of these isolation anomalies can occur, the
execution of the transactions is serializable (Gray and Reuter, 1993; Berenson
et al., 1995). In the following we will describe the three isolation anomalies that
are important in CSCW systems:

1. The lost update anomaly is by definition a situation where a first 
transaction reads an object for update without using locks. Subsequently, 
the object is updated by another transaction. Later, the first transaction 
(based on its earlier read value) updates the object and commits. 

In our transaction model all the local users have their own copy of the
database and reading/updating the local database copy functions as reads 
for update without locks. In such a situation it is possible for conflicting 
transactions to update the same object, and only the updating of the last 
transaction will survive.

2. The dirty read anomaly is by definition a situation where a first
transaction updates an object without locking the object or committing the
update. After this, a second transaction reads the object. Later, the first
update is aborted (or compensated). In other words, the second transaction
has read a version of the object that was never committed and therefore
never really existed.

In our transaction model the dirty read anomaly may happen when the 
first transaction updates an object by using a compensatable subtransac- 
tion that is distributed to all the local databases of the users. Later, these 
distributed updatings are removed by using compensating subtransactions. 
If a local user reads the object before it is compensated, the data read will 
be dirty and may result in a wrong decision. 

3. The non-repeatable read anomaly or fuzzy read is by definition a
situation where a first transaction reads an object without using locks. This
object is later updated and committed by a second transaction before the
first transaction has been committed. That is, if the first transaction rereads
the object, the attributes of the object have changed. In other words, the
first transaction may have read something that is no longer true when the
transaction commits, and this may result in a wrong decision.

In our transaction model this may happen when the first transaction 
reads an object in the local copy of the database. Later the same object may 
be updated by a retriable subtransaction without the local user noticing the 
update, which may cause the local user to make wrong decisions. 



2.4 The Durability Property 

Transactions have the durability property if the updatings of the transactions 
cannot be lost after they have been committed. For global atomic transactions 
the global durability property will automatically be implemented, as it is ensured 
by the durability of the local databases (Breibart et al., 1992).

3 Integration of the Commit/Isolation Protocols 

In major projects, group structures, roles, and activities may change during a
project. Therefore, according to e.g. Koch (1995) and Jones (1995), it should be
possible to change the commit/isolation procedure while the project is running.

All the commit/isolation protocols described in this section have a precisely
defined commit time, after which an update decision cannot be annulled auto- 
matically. This is practical from an implementation point of view, and it also 
suits most structured working situations. However, working groups (and individ- 
uals) do not always work in a structured way, and, therefore, it may be important 
to be able to undo already committed updatings. In this situation it is practical 
to have a common transaction model (like our transaction model) to manage the 
transaction back-out independently of the commit/isolation protocols used by the
transaction that should be backed out.

In this section we will illustrate how to integrate our transaction model with
some of the existing commit/isolation protocols described in the scientific
literature.

3.1 The Reread Countermeasure 

Transactions that use this countermeasure (Frank and Zahle, 1998) read an
object twice by using short duration locks for each reading. If a second
transaction has changed the object between the two readings, the transaction
must abort itself after the second read.

In asynchronous CSCW systems the reread countermeasure may be used to protect
against the lost update anomaly in the following way: After a user has updated
his/her local workspace, both the old version (or the version id) and the new
version of the changed objects are sent to the primary copy location, where the
primary copies of these objects are read. If the primary copies of the objects
are the same versions as the user's old versions, then the primary copy objects
are modified to the user's new versions. Otherwise, the updatings of the user
are rejected, and the committed primary copy version of the objects may be
displayed for the user in a special color as a "non-repeatable read" warning.
Later, the user may upgrade his/her updatings to the new object versions and
retry to submit the updating transaction.

In real time WYSIWIS systems it may be very confusing if different users delete,
change and/or move the same sentence/figure element independently (Greenberg
and Marwood, 1994). In this situation the reread countermeasure can prevent
the problem in the following way: At first, new updatings are executed at the
location of the updating user as compensatable updatings. Later, a pivot
subtransaction updates the primary copy if it is unchanged. Finally, the committed
updatings of the pivot subtransaction are propagated to the other users.
However, if the primary copy was changed by another user, the pivot updatings are
rejected and compensated in the location of the updating user.
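
The asynchronous variant of the reread countermeasure can be sketched as a version check at the primary copy location. This is only an illustration under stated assumptions: the class `PrimaryCopy`, its `read` and `submit` methods and the integer version ids are hypothetical, and the short duration locks are omitted because the sketch is single-threaded.

```python
class PrimaryCopy:
    """Holds the committed version of each object at the primary copy location."""

    def __init__(self):
        self._objects = {}  # object id -> (version id, value)

    def read(self, oid):
        # An object that was never written is reported as version 0.
        return self._objects.get(oid, (0, None))

    def submit(self, oid, old_version, new_value):
        """Try to replace the primary copy with the user's new version.

        Returns (accepted, version, value). On rejection, the committed
        version and value are returned so they can be shown to the user.
        """
        current_version, current_value = self.read(oid)  # the second read
        if current_version != old_version:
            # Another transaction changed the object in the meantime.
            return False, current_version, current_value
        self._objects[oid] = (old_version + 1, new_value)
        return True, old_version + 1, new_value
```

A rejected user would upgrade the local update to the returned version and submit again, which corresponds to the retry step described above.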



3.2 The Version Tree Protocol 

If different parallel versions of an object exist, they may be implemented by a 
version tree (Koch, 1995), where the different parallel versions are children of 
the same parent object. The following example illustrates how version trees may 
be integrated in our transaction model. 

Example 

Insertion of a new subtext (character string) into an object is implemented 
as a new object, which is a child of the original object. In other words, 
different transactions can create different versions of the same parent object 
by storing different child objects related to the same parent object. The 
child objects are identified by the id of the parent object in combination 
with the id of the updating transaction (and possibly a sequence number, 
if the updating transaction creates many child objects). A field value in
the child object marks the insertion as "compensatable" if the insertion is
not committed globally. A compensatable insertion can easily be committed
globally by de-marking the corresponding "compensatable" mark. However,
in this situation other compensatable marked updatings to the same object
must be upgraded to the new version of the object as described in the next
subsection. A computer program can do this, but the upgraded transactions
should be marked with "non-repeatable read anomaly" until a human
has accepted the upgraded insertion as a semantically correct insertion to the
new version of the object. If a human cannot accept the upgraded insertion,
the corresponding transaction must be compensated.
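
The bookkeeping in this example can be sketched as a small data structure. The names `VersionTree`, `ChildVersion` and `needs_review` are hypothetical; `needs_review` stands in for the "non-repeatable read anomaly" mark, and the actual upgrading of the pending insertions is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Key = Tuple[str, str, int]  # (parent id, transaction id, sequence number)


@dataclass
class ChildVersion:
    text: str
    compensatable: bool = True  # True until the insertion is committed globally
    needs_review: bool = False  # set when a human must accept the upgraded insertion


class VersionTree:
    def __init__(self) -> None:
        self.children: Dict[Key, ChildVersion] = {}

    def insert(self, parent_id: str, txn_id: str, text: str) -> Key:
        # The sequence number counts earlier children of the same transaction.
        seq = sum(1 for (p, t, _) in self.children if (p, t) == (parent_id, txn_id))
        key = (parent_id, txn_id, seq)
        self.children[key] = ChildVersion(text)
        return key

    def commit(self, parent_id: str, txn_id: str) -> List[Key]:
        """De-mark the insertions of txn_id; flag and return the other pending ones."""
        flagged = []
        for key, child in self.children.items():
            p, t, _ = key
            if p != parent_id:
                continue
            if t == txn_id:
                child.compensatable = False
            elif child.compensatable:
                child.needs_review = True
                flagged.append(key)
        return flagged

    def compensate(self, parent_id: str, txn_id: str) -> None:
        # Compensation removes every child the transaction stored under the parent.
        for key in [k for k in self.children if k[:2] == (parent_id, txn_id)]:
            del self.children[key]
```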



3.3 The Operational Transformation Protocol 

The objective of the distributed OPerational Transformation (dOPT) algorithm
described in Ellis and Gibbs (1989) is to implement concurrency control in real
time groupware systems. The algorithm was first implemented in the GROVE
system (Group Outline Viewing Editor) described in Ellis et al. (1990, 1991).
Later, the method of operational transformation was improved in Nichols
et al. (1995); Ressel et al. (1996); Sun et al. (1998) and Sun and Ellis (1998).
Operational transformation prevents lost updatings by transforming a second 
conflicting updating to another type of updating that cannot overwrite the first 
updating. The GROVE system uses a conflict matrix that describes how each 
type of conflict in text updatings may be transformed. By using operational 
transformation the dirty read anomaly cannot occur either, because an aborted 
object is only known to the user who made the aborted updating. After a com- 
pensatable subtransaction has been committed globally, operational transforma- 
tion may be used to upgrade automatically other compensatable subtransactions 
to the new version of the object. Other upgrading techniques are described in 
e.g. Neuwirth et al. (1992). Operational transformation does not deal with the
non-repeatable read anomaly. Therefore, other countermeasures may be used to
prevent this anomaly (see e.g. subsection 3.6).
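
The idea of transforming a conflicting updating can be illustrated with the simplest case, two concurrent text insertions. The sketch below is not the dOPT algorithm and not the GROVE conflict matrix; it is the standard insert-against-insert rule, and the names `Insert`, `apply_op`, `transform` and the `site` tie-breaker are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # character position of the insertion
    text: str   # inserted string
    site: int   # id of the originating user, used only to break ties


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so that it can be applied after `against` has been applied."""
    shifts = against.pos < op.pos or (against.pos == op.pos and against.site < op.site)
    if shifts:
        # `against` inserted text in front of `op`, so `op` moves to the right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op
```

Each site applies its own operation first and then the transformed remote operation. Both sites end with the same text and neither insertion overwrites the other, which is the sense in which lost updatings are prevented.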



3.4 The Linearization Protocol 

Linearization (Herlihy and Wing, 1990, and Pacull and Sandoz, 1993) is both 
a commit and an isolation protocol. The main idea of the protocol is that the
right to read, update or annotate the central copy of a document is passed
from one user to another on request. When a user has his/her turn, it is possible
to read new updated versions of requested central copy objects, and/or it is 
possible to overwrite the central copies of the objects with the user’s modified 
object versions. In the main version of this protocol the user only uses short 
duration locks. By integrating the reread countermeasure it is possible to prevent 
lost updatings. If another user has changed the central copy of an object, it 
should be possible to upgrade updatings to the new version. The dirty read 
anomaly cannot occur. The problems of the non-repeatable read anomaly may 
be prevented by rereading and control of all the data that has been changed since 
the last time the user had exclusive update rights. If this is done, the protocol
may produce serializable executions. However, this is not realistic, and, therefore,
it is also important to integrate countermeasures against the non-repeatable read 
anomaly in the protocol. This protocol may be integrated in our transaction 
model in the following way: At first, new updatings are executed at the location 
of the updating user as compensatable updatings. Later, when the user has access 
to the primary copy, the pivot subtransaction is executed. Finally, the copies 
of the other users are updated by using retriable subtransactions. Altogether, 
we evaluate the Linearization Protocol and its possibilities for integration with 
other isolation countermeasures to be good. In our view, the main problem of 
this protocol is how to get the users to collaborate in such a way that they do not 
spoil each other's updatings when they have the updating rights. DUPLEX
(Pacull et al., 1994), an implementation of the Linearization Protocol, has solved
the problem in the following way:

• The document is decomposed into independently editable parts. 

• The decomposition is dynamic and based on document structure; it reflects 
both document state and each author’s current responsibility and involve- 
ment on different parts. 

• Authors are allowed to choose the type of control (exclusive, pessimistic, 
optimistic, etc.,) that they wish on the document parts they are concerned 
with. 

We believe that these rules are very important in order to manage most
asynchronous groupware systems in a consistent way. Therefore, we recommend
integrating these rules into the previously described asynchronous protocols
wherever it is possible.



3.5 The Read Uncommitted Protocol 

In this protocol we will use our transaction model in the following way: At first, 
new updatings are executed at the location of the updating user as compensat- 
able updatings. Later, a pivot subtransaction updates the corresponding primary 
copy, and if the primary copy is changed by another user, the pivot updatings 
are rejected and compensated in the location of the updating user. The primary 
copy of the database is used to serialize the updating transactions in order to 
prevent the lost update anomaly. However, this protocol accepts both the dirty
read anomaly and the non-repeatable read anomaly. The reason is that in CSCW
systems with shared data it may be best to have access to "dirty" and
"non-repeatable read" data as early as possible, because the alternative only allows
access to "old information", and old information may be very old if the updating
transactions are long-lived. This protocol has very poor write availability if
different long-lived transactions want to update the same data. The protocol almost
corresponds to the "read uncommitted" isolation level (ANSI, 1992), where write
locks do not exclude reading transactions. However, the ANSI protocol does not
deal with primary and secondary copies. The protocol resembles the
commit/isolation protocol of the SEPIA hypertext authoring system described
in Haake and Wilson (1992), because this system uses the real "read
uncommitted" isolation level of the relational DBMS SYBASE. The main difference is
that the users of SEPIA do not have their own database copy, but this is not
a major difference when the users normally can read what they want as write
locks do not exclude readings. By using SEPIA it is possible to use the "SEPIA
Activity Spaces" for content, planning, argumentation, etc. as countermeasures
against the other consistency problems.



3.6 The Group Awareness Countermeasure 

The group awareness interaction and cooperation rules suggested in Koch (1995) 
may prevent the dirty read anomaly and the non-repeatable read anomaly. How- 
ever, group awareness may also have more social and innovative purposes than 
countermeasures against consistency problems. 

In tightly coupled WYSIWIS systems (e.g. Haake and Wilson, 1992), where
the users share the same view, an additional communication channel (e.g.
audio/video links) is almost necessary in order to prevent consistency problems.
In some situations Greenberg and Marwood (1994) recommend using the
additional communication channel both to prevent lost updatings and, if a warning
comes too late, to repair the lost data.

3.7 Conclusions 

This paper has illustrated how distributed semantic ACID properties can be 
implemented in distributed CSCW systems by using the single pivot transaction 
model and countermeasures against the different consistency problems that occur 
when only semantic ACID properties are implemented. 

It is not possible to select one protocol as the best, because some protocols 
are more suitable for large projects and others for small projects, etc. However, 
our analyses of the different commit/isolation protocols have illustrated that
countermeasures against lost updatings and the rest of the isolation anomalies
may be integrated in such a way that it is possible to tailor commit/isolation
protocols for the different phases of a given project. We have also illustrated 
that it may be important to use a common transaction model for all the com- 
mit/isolation protocols supported by a CSCW software product, because this 
model allows the upgrade and back-out tools for transactions to be designed in
such a way that they can accept changes in the commit/isolation protocol used
in the different phases of a CSCW project.

Abstract. Spatial data models have been extensively studied during
the last decade. However, the requirements of a spatial database system,
regardless of any specific application, have not received much attention.

In this paper, a general Object-Oriented spatial data model is introduced.
This model considers a spatial database system in general, without
focusing on specific features or applications, and presents a new method
for classification of spatial objects into maps. The concept of map as
defined here is an appropriate definition for objects with an arbitrary set
of spatial components. This concept is similar to the one of a map in the
real world. The map definition is followed by the definition of map hierarchy
and operations on maps which can be used to answer queries that might
be too complicated otherwise.



1 Introduction 

Topics such as urban planning, land use, city and road planning have recently 
received much attention. The spatial data related to these applications have 
specific features such as high volume and complex structure. Modeling spatial 
data is a basic step in designing a spatial database system. 

Research carried out so far mostly considers specific features
of a spatial database system [6,9], or discusses spatial data modeling from the
point of view of a specific application [3]. Furthermore, most of the database
systems which have been designed for spatial purposes have been built on
the relational approach [7,8]. However, this approach is not powerful enough to
be used as a basis for spatial database systems.

A recent approach is to build a spatial database system around an
Object-Oriented paradigm [11,4,10]. The benefits of Object-Orientation comply with the
requirements of spatial systems. In the literature, the only object categorization that
has been considered is the classification of objects into classes, and arbitrary
categorization of spatial properties has received no attention.

This paper presents a general Object-Oriented spatial data model without 
considering any particular features or requirements of a specific application. It 
introduces the concept of map as an arbitrary class of spatial objects and defines
operations on maps. The categorization of objects into maps allows us to
structure data in an efficient hierarchical way, define operations such as
Join and Zoom on maps which have a significant effect on the usability of the
database system, and answer a wide range of queries. The benefits of the
Object-Oriented paradigm provide high flexibility for the data model. Furthermore, the
data model is general enough to be used as the basis for a multipurpose spatial
database system.

This paper is divided into 5 sections. The following section introduces spatial
objects and object hierarchy. Section 3 explains the concept of map and the partial
order between maps. Operations on maps are given in section 4. Section 5
summarizes and concludes the paper.

2 Spatial Objects and Their Hierarchy 

Objects in a spatial database system have spatial and descriptive (non-spatial)
attributes. Descriptive attributes might be numbers, character strings or booleans.
Various types of spatial attributes have been defined in the literature [6,11]. We
employ the main types point, line and region.

The smallest definable spatial attribute is a point which can be represented
by its coordinates in the Euclidean plane. Given two distinct points $p_1$ and $p_2$,
a line segment is defined which connects the two points. A connected graph
consisting of a set of line segments is defined as a line. A region is defined as
a set of continuous points.

The order relation on spatial data is defined as follows. 

Definition 1. Let $P$, $L$ and $R$ be spatial attributes with types Point, Line and
Region; then $P \lhd L$ and $L \lhd R$.

We assume that a spatial object is an object with only one spatial attribute. 
This assumption simplifies the definition of spatial operations. For more compli- 
cated cases where various spatial attributes have to be assumed, the concept of 
map will be considered. 

Definition 2. An object is a spatial object if it has one and only one spatial 
attribute from types point, line or region. 

Various operations on spatial data have been defined in the literature [11,6]. 
We have defined a set of spatial operations that can be found in the full paper.

Definition 3. Two spatial objects $O_1$ and $O_2$ are identical ($O_1 \approx O_2$) if and
only if their spatial attributes have the same value.

The following function returns the spatial attribute of a spatial object.

Definition 4. Let $O$ be a spatial object and $S$ be its spatial attribute; then $SA(O) = S$.

2.1 PART-OF Hierarchy

The part-whole relation has already been studied in detail [2,12]. We study this
relation from the point of view of a spatial data model. The PART-OF relation between
spatial objects is defined as follows:

Definition 5. Let $O_1$ and $O_2$ be spatial objects with $R_1$ and $R_2$ as their spatial
attributes; $O_1$ PART-OF $O_2$ iff $R_1 \subset R_2$.
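
Definitions 2 to 5 can be illustrated with a small sketch. It rests on several assumptions that are not in the paper: spatial attributes are discretized into finite sets of grid cells, the class `SpatialObject` and the helper names are hypothetical, and the inclusion in Definition 5 is read as "subset or equal".

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]  # a grid cell standing in for a point of the plane


@dataclass(frozen=True)
class SpatialObject:
    name: str
    kind: str                # "point", "line" or "region"
    extent: FrozenSet[Cell]  # the single spatial attribute (Definition 2)


def SA(obj: SpatialObject) -> FrozenSet[Cell]:
    """Definition 4: return the spatial attribute of a spatial object."""
    return obj.extent


def identical(o1: SpatialObject, o2: SpatialObject) -> bool:
    """Definition 3: identical iff the spatial attributes have the same value."""
    return SA(o1) == SA(o2)


def part_of(o1: SpatialObject, o2: SpatialObject) -> bool:
    """Definition 5, with set inclusion read as non-strict (an assumption)."""
    return SA(o1) <= SA(o2)
```

With this representation, a park whose cells all lie inside the cells of a city is PART-OF the city, while two objects with different names but equal extents are identical in the sense of Definition 3.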

We recognize four different interpretations of the PART-OF relation:

- We say object $O_1$ is a part of object $O_2$ such that whole requires part, and we
write $O_1$ WRP-PO $O_2$, if $O_2$ cannot exist without $O_1$ (e.g. Water-Storage
is WRP-PO City).

- Object $O_1$ is a PRW-PO (part requires whole) part of object $O_2$ if $O_1$
cannot exist without being part of $O_2$ (e.g. a Movie-Theater is PRW-PO
City).

- A PART-OF relation is called strong (S-PO) if it is both WRP-PO and
PRW-PO (e.g. City-Government is S-PO City).

- A PART-OF relation is called weak (W-PO) if it is neither WRP-PO
nor PRW-PO (e.g. Gold-Mine is W-PO City).
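
The four interpretations are determined by two yes/no questions, which the following sketch makes explicit. The function name `part_of_kind` and its two boolean parameters are hypothetical; the sketch returns the most specific label, so a relation that satisfies both requirements is reported as S-PO.

```python
def part_of_kind(whole_requires_part: bool, part_requires_whole: bool) -> str:
    """Classify a PART-OF relation by the two existence requirements."""
    if whole_requires_part and part_requires_whole:
        return "S-PO"    # strong: both WRP-PO and PRW-PO
    if whole_requires_part:
        return "WRP-PO"  # the whole cannot exist without the part
    if part_requires_whole:
        return "PRW-PO"  # the part cannot exist outside the whole
    return "W-PO"        # weak: neither requirement holds
```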

2.2 IS-A, PART-OF Interrelationship 

Figure 1 displays the general form of interrelationship between IS-A and
PART-OF relations. However, this interrelationship does not hold for all types
of PART-OF.

The following cases of interrelationship between IS-A and PART-OF relations
can be derived from Figure 1. In what follows, $:$ represents class membership
and $::$ denotes subclass relation.

1. $\forall q, s, r\; (r :: s \vee r : s) \wedge (q \text{ WRP-PO } s) \Rightarrow \exists p\; ((p :: q \vee p : q) \wedge (p \text{ WRP-PO } r))$.

2. $\forall p, q, s\; (p :: q \vee p : q) \wedge (q \text{ PRW-PO } s) \Rightarrow \exists r\; ((r :: s \vee r : s) \wedge (p \text{ PRW-PO } r))$.

3. Cases 1 and 2 also hold for strong part-of (S-PO).

3 Object Categorization 

A hierarchy of objects is produced by defining spatial objects and relations IS-A 
and PART-OF on them. Object Country, for instance, can have objects City, 
River, Road, Lake, Sea and Mountain as its parts. The PART-OF hierarchy, 
arranges the above set of objects in a hierarchical order. 

A map is defined as a specific type of spatial object which might have other 
spatial objects as its components. 



Fig. 1. IS-A, PART-OF interrelationship

Definition 6. A map is a spatial object such that there is at least one other
spatial object related to it by a PART-OF relation.

A map $m$ consisting of parts $O_1, \ldots, O_n$ is represented as $m(O_1, \ldots, O_n)$.

Definition 7. A spatial object that is not a map (has no parts) is called a simple
spatial object.

3.1 Partial Order of Spatial Objects 

The set of all spatial objects is partially ordered, and the relation $\le$ is defined
based on the level of detailed information that is contained in each object. The
partial order is defined recursively. In the following, CR is a correspondence relation
between two objects.

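Definition 8. Let $A$ and $B$ be spatial objects.
$A \le B$ iff

$(A \approx B) \Rightarrow$ ($A$ is a simple spatial object $\vee$ $\forall O$ PART-OF $A$ $\exists O'$ PART-OF $B$, $CR(O, O') \wedge O \le O'$)

$(A \not\approx B) \Rightarrow$ ($A$ is a simple spatial object $\wedge$ $SA(A) \lhd SA(B)$)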

Figure 2 displays partial order between maps. 

Partial equality of spatial objects is defined as follows:

Definition 9. Let $A$ and $B$ be spatial objects;
$A = B$ iff $A \le B \wedge B \le A$.




Fig. 2. Partial ordering of maps (m1, m2, m3)



4 Operations on Maps 

Once the concept of map is defined, operations can be introduced to manipulate 
maps. Some of the operations which have been already defined on objects can 
be expanded to maps and some new operations can be introduced as well. A set 
of map operations has been formally defined and will appear in the full paper. 
Zoom and Join operations will be discussed here in brief. 

The Join operation creates a new map by joining two adjacent maps. The
operation may accept specific conditions to determine if adjacent ($\parallel$) objects of the
same type must be unified into one or may remain separate.

Definition 10. (Join): Let $m_1(O_1, O_2, \ldots, O_n)$ and $m_2(O'_1, O'_2, \ldots, O'_k)$ be two
maps such that $O_1 : T_1, O_2 : T_2, \ldots, O_n : T_n$ and $O'_1 : T'_1, O'_2 : T'_2, \ldots, O'_k : T'_k$ and

$\exists O_i\,(1 \le i \le n)\ \exists O'_j\,(1 \le j \le k),\ T_i = T'_j \wedge O_i \parallel O'_j$.

Unconditional Join: $m_1 + m_2 = m(O_1, \ldots, O_n, O'_1, \ldots, O'_k)$.

Conditional Join: $(m_1 \oplus m_2)_{O_i, O'_j} = m_c(O_1, \ldots, O_{i-1}, O_{i+1}, \ldots, O_n, O'_1, \ldots, O'_{j-1}, O'_{j+1}, \ldots, O'_k, O_c)$ where $O_c = O_i \oplus O'_j$.
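
The two forms of Join can be sketched on maps represented as tuples of typed objects. All names here (`Obj`, `unconditional_join`, `conditional_join`) are hypothetical, and the adjacency test and the unification of two objects are passed in as functions because their spatial definitions are given only in the full paper.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Obj:
    name: str
    type: str


Map = Tuple[Obj, ...]


def unconditional_join(m1: Map, m2: Map) -> Map:
    """m1 + m2: all components of both maps remain separate."""
    return m1 + m2


def conditional_join(
    m1: Map,
    m2: Map,
    i: int,
    j: int,
    adjacent: Callable[[Obj, Obj], bool],
    unify: Callable[[Obj, Obj], Obj],
) -> Map:
    """Join m1 and m2 and unify m1[i] with m2[j] into one object O_c."""
    oi, oj = m1[i], m2[j]
    if oi.type != oj.type or not adjacent(oi, oj):
        raise ValueError("objects must have the same type and be adjacent")
    oc = unify(oi, oj)
    # Components other than O_i and O'_j are kept; O_c is appended last.
    return m1[:i] + m1[i + 1:] + m2[:j] + m2[j + 1:] + (oc,)
```

The order of the result follows Definition 10: the remaining components of the first map, then those of the second map, then the unified object.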





For instance, two maps West-Germany and East-Germany can be joined into one
map called Germany. If no conditions are considered, city objects West-Berlin
and East-Berlin will remain as separate objects in the new map; however, by the
condition to join the two country maps over the two city objects, they will also
be joined into one city. Figure 3 displays an example of conditional Join.

Fig. 3. Join maps m1 and m2 on River1 and River2

The Zooming operation on maps is defined based on our definition of partial order
of maps. Since maps can be recursively defined, a map can consist of other
maps. Zooming a map on one of its components will return the next
detailed level of that component from the hierarchy of partial order. The zoom
can be continued until the last (most detailed) level of the hierarchy has been reached.

Fig. 4. Zoom of map m on o4



Definition 11. (Zoom): Let $m_1(O_1, O_2, \ldots, O_n)$ and $m_2(O'_1, O'_2, \ldots, O'_k)$ be two
maps such that $m_1 \le m_2$, $Zoom(m_1)_{O_i} =$



Figure 4 illustrates zoom operation. 



5 Conclusion 

A general Object-Oriented spatial data model was presented that has the
potential to model various types of data related to a spatial database system. The object
hierarchy PART-OF and the interrelationship between the PART-OF and IS-A
hierarchies were defined. Considering spatial objects with more than one spatial
attribute has drawbacks such as complicated processing and a flat (non-hierarchical)
structure of objects. Therefore, we assumed a spatial object with only one
spatial attribute and defined the concept of map for arbitrary classification of
objects with spatial attributes of various types. The definition of map and the
PART-OF relation between maps create a hierarchy of data similar to the hierarchy
between objects in the real world. This hierarchy is introduced as a partial order.
Another benefit of using maps is the possibility to reuse basic spatial objects in
forming maps. The partial order between maps provides the basis for a formal
definition of the zooming process. Zoom is one of the specific and crucial features of a
spatial database system that up to now has not been formally defined.
Introducing a formalism for zoom in the modeling phase enables us to determine
specifically the portion of data that should be displayed in every step of
zooming. Data security in spatial database systems is a very important concern. To
build a secure spatial database system, data must be carefully structured in a
hierarchical way. The formalisms presented in this paper for map hierarchy and
partial order are rich enough to handle issues related to data security.

The Join operation on maps was defined. Map operations such as Join are applied
to maps and act on the map and its components at the same time. In other
words, an operation on a map will be recursively applied to its components. The
designed data model is a rich collection of database formalisms, conventions and
operations.

Abstract. We introduce an object-oriented design pattern called Twin
that allows us to model multiple inheritance in programming languages
that do not support this feature (e.g. Java, Modula-3, Oberon-2). The
pattern avoids many of the problems of multiple inheritance while
keeping most of its benefits. The structure of this paper corresponds to the
form of the design pattern catalogue in [GHJV95].



1 Motivation 

Design patterns are schematic standard solutions to recurring software design
problems. They encapsulate a designer's experience and make it reusable in
similar contexts. Recently, a great number of design patterns have been discovered
and published ([GHJV95], [Pree95], [BMRSS96]). Some of them are directly
supported in a programming language (e.g. the Prototype pattern in Self or the
Iterator pattern in CLU), some are not. In this paper we describe a design pattern
which allows a programmer to simulate multiple inheritance in languages which
do not support this feature directly.

Multiple inheritance allows one to inherit data and code from more than one 
base class. It is a controversial feature that is claimed to be indispensable by 
some programmers, but also blamed for problems by others, since it can lead to 
name clashes, complexity and inefficiency. In most cases, software architectures 
become cleaner and simpler when multiple inheritance is avoided, but there are 
also situations where this feature is really needed. If one is programming in a
language that does not support multiple inheritance (e.g. in Java, Modula-3 or
Oberon-2), but one really needs this feature, one has to find a work-around.
The Twin pattern — introduced in this paper — provides a standard solution 
for such cases. It gives one most of the benefits of multiple inheritance while 
avoiding many of its problems. 

The rest of this paper is structured according to the pattern catalogue in 
[GHJV95] so that the Twin pattern could in principle be incorporated into this 
catalogue. 

1.1 Example 

As a motivating example for a situation that requires multiple inheritance, con- 
sider a computer ball game consisting of active and passive game objects. The 
active objects are balls that move across the screen at a certain speed. The
passive objects are paddles, walls and other obstacles that are either fixed at a 
certain screen position or can be moved under the control of the user. 

The design of such a game is shown in Fig. 1. All game items (paddles,
walls, balls, etc.) are derived from a common base class GameItem from which
they inherit methods for drawing or collision checking. Methods such as draw()
and intersects() are abstract and have to be refined in subclasses. check() is a
template method, i.e. it consists of calls to other abstract methods that must be
implemented by concrete game item classes later. It tests if an item intersects
with some other item and calls the other item’s collideWith() method in that case. In
addition to being game items, active objects (i.e. balls) are also derived from class
Thread. All threads are controlled by a scheduler using preemptive multitasking.




Fig. 1. Class hierarchy of a computer ball game 



The body of a ball thread is implemented in its run() method. When a ball 
thread is running, it repeatedly moves and draws the ball. If the user clicks on 
a ball, the ball sends itself a suspend() message to stop its movement. Clicking 
on the ball again sends a resume() message to make the ball move again.

The important thing about this example is that balls are both game items 
and threads (i.e. they are compatible with both). They can be linked into a list 
of game items, for example, so that they can be sent draw() and intersects()
messages. But they can also be linked into a list of threads from which the 
scheduler selects the next thread to run. Thus, balls have to be compatible with 
both base classes. This is a typical case where multiple inheritance is useful. 

Languages like Java don’t support multiple inheritance, so how can we
implement this design in Java? In Java, a class can extend only one base class but
it can implement several interfaces. Let’s see if we can get along with multiple
interface inheritance here. Ball could extend Thread and thus inherit the code of
suspend() and resume(). However, it is not possible to treat GameItem just as an
interface because GameItem is not fully abstract. It has a method check(), which
contains code. Ball would like to inherit this code from GameItem and should
therefore extend it as well. Thus Ball really has to extend two base classes.

This is the place where the Twin pattern comes in. The basic idea is as follows:
Instead of having a single class Ball that is derived from both GameItem and
Thread, we have two separate classes BallItem and BallThread, which are derived
from GameItem and Thread, respectively (Fig. 2). BallItem and BallThread are
closely coupled via fields so that we can view them as a Twin object having two
ends: The BallItem end is compatible with GameItem and can be linked into a
list of game items; the BallThread end is compatible with Thread and can be
linked into a list of threads.



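[Figure 2: class diagram. BallItem (a subclass of GameItem) and BallThread (a subclass
of Thread) refer to each other through their twin fields. The diagram annotates
BallItem.click() with "if (suspended) twin.resume(); else twin.suspend();" and
BallThread.run() with "while (true) { twin.draw(); twin.move(); twin.draw(); }".]

Fig. 2. The class Ball from Fig. 1 was split into two classes, which make up a
twin object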



Twin objects are always created in pairs. When the scheduler activates a
BallThread object by calling its method run(), the object moves the ball by
sending its twin the messages move() and draw(). On the other hand, when the
user clicks on a ball with the mouse, the BallItem object reacts to the click and
sends its twin the messages suspend() and resume() as appropriate.

Using only single inheritance, we have obtained most of the benefits of multiple
inheritance: Active game objects inherit code from both GameItem and
Thread. They are also compatible with both, i.e. they can be treated both as
game items (draw, click) and as threads (run). As a pleasant side effect, we
have avoided a major problem of multiple inheritance, namely name clashes. If
GameItem and Thread had fields or methods with the same name, they would
be inherited by BallItem and BallThread independently. No name clash would
occur. Similarly, if GameItem and Thread had a common base class B, the fields
and methods of B would be handed down to BallItem and to BallThread separately,
again without name clashes.

2 Applicability 

The Twin pattern can be used 

• to simulate multiple inheritance in a language that does not support this 
feature. 

• to avoid certain problems of multiple inheritance such as name clashes. 

3 Structure 

The typical structure of multiple inheritance is described in Fig. 3. 




Fig. 3. Typical structure of multiple inheritance 



It can be replaced by the Twin pattern structure described in Fig. 4. 

4 Participants 

Parent1 (GameItem) and Parent2 (Thread)

• The classes from which you want to inherit. 

Child1 (BallItem) and Child2 (BallThread)

• The subclasses of Parent1 and Parent2. They are mutually linked via fields.
Each subclass may override methods inherited from its parent. New methods
and fields are usually declared just in one of the subclasses (e.g. in Child1).

Fig. 4. Typical structure of the Twin pattern



5 Collaborations 

• Every child class is responsible for the protocol inherited from its parent. 
It handles messages from this protocol and forwards other messages to its 
partner class. 

• Clients of the Twin pattern reference one of the twin objects directly (e.g.
ballItem) and the other via its twin field (e.g. ballItem.twin).

• Clients that rely on the protocols of Parent1 or Parent2 communicate with
objects of the respective child class (Child1 or Child2).

6 Consequences 

Although the Twin pattern is able to simulate multiple inheritance, it is not 
identical to it. There are several problems that one has to be aware of: 

1. Subclassing the Twin pattern. If the Twin pattern should again be subclassed,
it is often sufficient to subclass just one of the partners, for example Child1.
In order to pass the interface of both partner classes down to the subclass, it
is convenient to collect the methods of both partners in one class. One can
add the methods of Child2 also to Child1 and let them forward requests to
the other partner (Fig. 5).

This solution has the problem that Sub is only compatible with Child1 but
not with Child2. If one wants to make the subclass compatible with both
Child1 and Child2 one has to model it according to the Twin pattern again
(Fig. 6).

Fig. 5. Subclassing a twin class. Child1.M2() forwards the message to
Child2.M2()




Fig. 6. The subclass of Child1 and Child2 is again a Twin class

2. More than two parent classes. The Twin pattern can be extended to more
than two parent classes in a straightforward way. For every parent class there
must be a child class. All child classes have to be mutually linked via fields
(Fig. 7).




Fig. 7. A Twin class derived from three parent classes 

Although this is considerably more complex than multiple inheritance, it is 
rare that a class inherits from more than two parent classes. 
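
The forwarding arrangement of consequence 1 can be sketched in Java as follows. This is a minimal illustration rather than code from the game: the method names m1() and m2(), the string results, and the helper class TwinFactory are assumptions made for the example. Child1 collects the interface of both partners by forwarding m2() to its twin, and Sub then inherits that combined interface while remaining compatible with Parent1 only.

```java
// Stand-ins for the two classes we would like to inherit from.
class Parent1 {
    String m1() { return "Parent1.m1"; }
}

class Parent2 {
    String m2() { return "Parent2.m2"; }
}

// Child1 offers its own protocol plus a forwarder for the partner's m2().
class Child1 extends Parent1 {
    Child2 twin;                          // link to the partner object

    String m2() { return twin.m2(); }     // forward the request to the twin
}

class Child2 extends Parent2 {
    Child1 twin;                          // link back to the partner object

    @Override
    String m2() { return "Child2.m2"; }
}

// Sub inherits m1() and the forwarding m2(), but it is not a Parent2.
class Sub extends Child1 {
    @Override
    String m1() { return "Sub.m1"; }
}

// Hypothetical helper: twin objects are created in pairs and linked both ways.
class TwinFactory {
    static Sub newSub() {
        Sub sub = new Sub();
        Child2 partner = new Child2();
        sub.twin = partner;
        partner.twin = sub;
        return sub;
    }
}
```

A client holding a Sub can call both m1() and m2() on it, yet only sub.twin can be placed where a Parent2 is expected. This is the compatibility limitation that the arrangement of Fig. 6 removes.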



7 Implementation 

The following issues should be considered when implementing the Twin pattern: 

1. Data abstraction. The partners of a twin class have to cooperate closely.
They probably have to access each other’s private fields and methods. Most
languages provide features to do that, i.e. to let related classes see more about
each other than foreign classes. In Java, one can put the partner classes into
a common package and implement the private fields and methods with the
package visibility attribute. In Modula-3 and Oberon one can put the partner
classes into the same module so that they have unrestricted access to each
other’s components.

2. Efficiency. The Twin pattern replaces inheritance relationships by composition.
This requires forwarding of messages, which is less efficient than inheritance.
However, multiple inheritance is itself slightly less efficient than
single inheritance [Str89], so the additional run-time costs of the Twin
pattern are not a major problem.
8 Sample Code

We sketch the implementation of the motivating example (a computer game
board with moving balls) in Java. The board is represented by a class Gameboard.
It has a certain width and height and a reference to a list of game items.

public class Gameboard extends Canvas {
    public int width, height;
    public GameItem firstitem;

}



The game items are derived from an abstract class GameItem. Every item
has a reference to the game board, a position on this board and a reference to
the next game item. It has abstract methods to draw itself, to react to mouse
clicks, to check whether it intersects with some other game item and to take
measures for a collision with other game items.

public abstract class GameItem {
    Gameboard board;
    int posX, posY;
    GameItem next;

    public abstract void draw();
    public abstract void click(MouseEvent e);
    public abstract boolean intersects(GameItem other);
    public abstract void collideWith(GameItem other);

    public void check() { ... }
}



The method check() is a template method, which checks if this object intersects
with any other object on the board. If so, it does whatever it has to do for
a collision.

public void check() {
    GameItem x;
    for (x = board.firstitem; x != null; x = x.next)
        if (intersects(x)) collideWith(x);
}



Balls are twin objects derived from GameItem and Thread. As shown in
Fig. 2 we implement the twin group as BallItem (a subclass of GameItem) and
BallThread (a subclass of Thread). Ball items move at a certain speed (dx, dy)
and have to override the inherited methods draw, click, intersects and
collideWith.

public class BallItem extends GameItem {
    BallThread twin;
    int radius;
    int dx, dy;
    boolean suspended;

    public void draw() {
        board.getGraphics().drawOval(posX - radius,
            posY - radius, 2 * radius, 2 * radius);
    }
    public void move() { posX += dx; posY += dy; }
    public void click(MouseEvent e) { ... }
    public boolean intersects(GameItem other) { ... }
    public void collideWith(GameItem other) { ... }
}



In order to simplify things, we assume that balls can only collide with walls,
which are another kind of game item. The intersects method of a BallItem can
then be implemented as

public boolean intersects(GameItem other) {
    if (other instanceof Wall)
        return posX - radius <= other.posX
            && other.posX <= posX + radius
            || posY - radius <= other.posY
            && other.posY <= posY + radius;
    else return false;
}



A collision with a wall changes the direction of the ball, which can be imple- 
mented as 

public void collideWith(GameItem other) {
    Wall wall = (Wall) other;
    if (wall.isVertical) dx = -dx; else dy = -dy;
}



When the user clicks on a moving ball it stops; when he clicks on a stopped 
ball it starts to move again. This is implemented by suspending and resuming 
the corresponding ball thread (the twin object). 

public void click(MouseEvent e) {
    if (suspended) twin.resume(); else twin.suspend();
    suspended = !suspended;
}



The class BallThread is derived from the standard class java.lang.Thread.
It has a reference to its twin class BallItem. The only method that has to be
implemented is run(). The implementation of other methods such as suspend()
and resume() is inherited from Thread.
public class BallThread extends Thread {
    BallItem twin;

    public void run() {
        while (true) {
            twin.draw(); /* erase */ twin.move(); twin.draw();
        }
    }
}



When a new ball is needed, the program has to create both a BallItem and
a BallThread object and link them together, for example:

public static BallItem newBall(int posX, int posY, int radius) { // method of Gameboard
    BallItem ballItem = new BallItem(posX, posY, radius);
    BallThread ballThread = new BallThread();
    ballItem.twin = ballThread;
    ballThread.twin = ballItem;
    return ballItem;
}



The returned ball item can be linked into the list of game items in the game 
board. The corresponding ball thread can be started to make the ball move. 
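
The fragments of this section can be assembled into a small self-contained sketch. It is an illustration under stated assumptions, not the code of the game: drawing is replaced by a counter (drawCount), the board and the collision logic are left out, and the ball is paused through a hypothetical paused flag and a step() helper instead of Thread.suspend() and Thread.resume(), which later Java versions deprecate. The helper class Balls is likewise an assumption.

```java
// Reduced stand-in for the abstract game item class (no board, no collisions).
abstract class GameItem {
    int posX, posY;
    abstract void draw();
}

class BallItem extends GameItem {
    BallThread twin;              // the thread end of the twin object
    int dx = 1, dy = 1;
    int drawCount = 0;            // stands in for real drawing

    @Override
    void draw() { drawCount++; }

    void move() { posX += dx; posY += dy; }

    // Toggle the partner's pause flag instead of calling suspend()/resume().
    void click() { twin.setPaused(!twin.isPaused()); }
}

class BallThread extends Thread {
    BallItem twin;                // the game item end of the twin object
    private volatile boolean paused = false;

    boolean isPaused() { return paused; }
    void setPaused(boolean p) { paused = p; }

    // One animation step (erase, move, redraw); does nothing while paused.
    void step() {
        if (paused) return;
        twin.draw();
        twin.move();
        twin.draw();
    }

    @Override
    public void run() {
        while (!isInterrupted()) {
            step();
            try { Thread.sleep(10); } catch (InterruptedException e) { return; }
        }
    }
}

// Hypothetical factory: creates both ends of the twin and links them.
class Balls {
    static BallItem newBall(int x, int y) {
        BallItem item = new BallItem();
        item.posX = x;
        item.posY = y;
        BallThread thread = new BallThread();
        item.twin = thread;
        thread.twin = item;
        return item;
    }
}
```

Calling Balls.newBall(0, 0).twin.start() would animate the ball on its own thread; calling step() directly, as a test might do, exercises the same logic without any concurrency.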

9 Known Uses 

The motivating example of a ball game (Section 1) was implemented as a teach- 
ing exercise in Oberon-2, a language that does not support multiple inheritance. 
The Oberon system uses cooperative multitasking. It maintains a list of user pro- 
cesses that are activated whenever the system is idle. A ball is a special instance 
of a process and at the same time a game object. 

Another example can be found in the context of Java applets. Applets are
active objects that live on Web pages and react to user input such as mouse
clicks. When a user clicks on an applet, the applet notifies all registered mouse
listeners to react to the event. If an applet wants to react to the click itself,
it has to implement the MouseListener interface, so that it can be registered
as an appropriate listener with itself. It must also extend the class Applet. The
following code shows the declaration of a class MyApplet:

class MyApplet extends Applet implements MouseListener {
    ...
}



The MouseListener interface (a standard interface of the Java libraries) specifies
5 methods that have to be implemented in MyApplet:

interface MouseListener extends EventListener {
    public void mousePressed(MouseEvent event);
    public void mouseClicked(MouseEvent event);
    public void mouseReleased(MouseEvent event);
    public void mouseEntered(MouseEvent event);
    public void mouseExited(MouseEvent event);
}

Some of these methods are often identical in different listener implementa- 
tions. For example, several listeners change the shape of the cursor in the same 
way when it enters or exits the applet area on the screen. Therefore, we would like 
to have a prefabricated mouse listener class (StdMouseListener), which already
provides standard implementations for the methods mouseEntered and mouse- 
Exited. Other listeners could then inherit these standard implementations. 

We are now in a situation where we would like to inherit code from two 
classes, namely from Applet and StdMouseListener, but this is not possible in 
Java. We can only inherit from one class. We can, however, apply the Twin 
pattern, which results in the following architecture (Fig. 8). 




Fig. 8. A twin applet that inherits code both from Applet and from StdMouseListener

MyApplet inherits code from Applet; MyAppletListener inherits code from
StdMouseListener. A MyAppletListener object will be registered as a mouse 
listener for MyApplet. When it is notified about a mouse click it accesses its 
applet to perform an appropriate action. 

In [CaW98] a similar solution is presented using inner classes. MyAppletLis- 
tener is implemented there as an inner class of MyApplet. This allows MyAp- 
pletListener to access all private instance variables of MyApplet. No explicit link 
between the classes is necessary. However, this solution is asymmetric. MyApplet 
cannot access the private instance variables of MyAppletListener. 
10 Related Patterns

The Twin pattern is related to the Adapter pattern, especially to the Two-Way
Adapter described in [GHJV95], which is recommended when two different
clients need to view an object differently. However, the Two-Way Adapter is
implemented with multiple inheritance while the Twin pattern avoids this feature.
Abstract. We propose a formal semantics for object data models. Our
approach may be seen as a semantic approach to object-relational models.
It is object-oriented because it captures the main concepts of object-oriented
models, namely class, method, object identity, inheritance, collection
types and persistence; it is relational because it maintains the main
characteristics of the relational model, especially the clear separation between
schema, instance and querying. Moreover, it is functional in the
sense that it is based on a simple algebra of partial functions whose main
role is to perform arithmetic computations, similar to commercial languages.
Another important aspect of our approach is that it provides a
rigorous mathematical treatment of null values.

1 Introduction 

A database can usually be seen as a collection of records. The type of a record’s
field is either a basic type as in the pure relational model [3], or a set of record 
types as in the nested relational model [17], or any combination of set and record 
types as in complex object models [2,9]. Some of these models also support null 
values [19]. But none of these models reflects the semantics of real world objects. 
On the other hand, object-oriented models claim to overcome this semantics 
problem. 

In object-oriented models [1,11], records, called objects in the trade, have a
special field and may have other additional fields. The value of the special field 
is assumed to give a unique identification of the object in a context. The context 
of an object is its class which is a named collection of objects with the same 
type. Classes are organized in an inheritance hierarchy. Other additional fields 
of records are computations (or methods). 

In the absence of a standard formal object data model, various models and var- 
ious query languages have been proposed [4,10,8,5]. The ODMG group carried 
out an effort of standardization and proposed an object data language ODL, and 
an object query language OQL [7]. However, no formal model has emerged with 
the same authority as the relational model, and no algebraic query language 
with the same elegance as the relational algebra. We think that existing propos- 
als contain enough material for defining a formal model and an algebraic query 
language. In this paper we intend to go towards such a definition, and to provide
a rigorous mathematical treatment for all concepts of object-oriented models,
namely attributes, methods, classes, inheritance, object identity, persistence and 
other related concepts. Thus, we don’t claim to introduce yet another model, but 
simply to give formal semantics for most common object concepts in different 
existing object models. In doing so, we follow the relational database tradition; 
namely the clear separation between schema, domain, instance and query. 

To carry through this objective we consider a class as a named collection of
partial functions with the same domain. The result of each function on an object
is a calculation or a value which is an element of some type. Type expressions
are obtained from basic types and class names, using two constructors set and
$\otimes$. However, the semantics of $\otimes$ in this paper is not the usual cartesian product
of types but a semantics more suitable for dealing with partial functions and null
values. Our semantics for $\otimes$ provides a nice and rigorous treatment of null values
in the object paradigm. This apparently slightly different way of seeing objects,
however, leads to a different philosophy for their design and their manipulation.
For example, our approach allows us

— to provide a uniform representation of attributes, methods and inheritance; 

— to resolve neatly delicate problems of inheritance, namely multiple inheri- 
tance, overloading and renaming; 

— to deal with null values, a subject that most object data models do not treat 
explicitly; and 

— to define an algebraic semantics, based on a simple algebra of functions with 
few simple operations. 

In fact, the algebra of functions that we are using can also serve for defining an
algebraic query language. This subject, however, is not treated in this paper due
to space limitations. The interested reader is referred to [13] for a simple version
of this query language.



2 The Data Model 

2.1 Database Schema 

In what follows, by an inheritance relation over a set $X$, we mean a finite binary
relation which is irreflexive and has no cycle. Clearly the transitive and reflexive
closure of any inheritance relation is a partial order. An inheritance relation is
represented either (1) as a finite subset $R$ of $X \times X$ or (2) as a set-valued function
$R : X \to \mathcal{P}_f(X)$, where $\mathcal{P}_f(X)$ is the set of finite subsets of $X$. The correspondence
between the two representations is $x \mathrel{R} y \iff y \in R(x)$. We use the same symbol,
say $R$, for both representations, and we denote by $R(x)$ the set of elements related
to $x$ via $R$. Similarly, given three sets $X$, $Y$ and $Z$, we consider a finite ternary
relation over $(X, Y, Z)$ alternatively (1) as a finite subset $R$ of $X \times Y \times Z$, or (2)
as a function $R : X \to \mathcal{P}_f(Y \times Z)$.
Definition 1 A ternary relation $R$ over $(X, Y, Z)$ is said to be $XY$-functional to
$Z$ if it satisfies: $\forall x \in X\ \forall y \in Y\ \forall z \in Z\ \forall z' \in Z\ (R(x, y, z) \wedge R(x, y, z') \Rightarrow z = z')$.

We begin with three enumerable non empty and pairwise disjoint sets $\mathcal{A}$, $\mathcal{M}$ and
$\mathcal{C}$. Elements of these sets are called attribute names, method names, and class
names respectively. Now, let $C$ be a non empty finite set of class names, and $\beta$ a
non empty finite set of type names, that we shall call basic. We consider a type
system in which types are built from $\beta$ and $C$ using two constructors $\otimes$ and set.
That is, the set $T_C$ of types is defined as follows:

$T_C ::= \beta \mid C \mid T_C \otimes T_C \mid \mathrm{set}\ T_C$

Elements of $T_C$ are called object-types (or types for simplicity). We omit the
subscript $C$ whenever there is no possibility of confusion. In the sequel, $T_C^+$
will denote the set of non empty sequences of elements of $T_C$, and inheritance
relations will commonly be denoted by $isa$.

Definition 2 We say $S = (C, isa, att, meth)$ is an object-oriented database
schema (or a schema for short) if:

— $C$ is a finite non empty subset of class names,

— $isa$ is an inheritance relation over $C$,

— $att$ is a finite ternary relation over $(C, \mathcal{A}, T_C)$, which is $C\mathcal{A}$-functional to $T_C$,

— $meth$ is a finite ternary relation over $(C, \mathcal{M}, T_C^+)$, which is $C\mathcal{M}$-functional
to $T_C^+$,

such that for every $c$ in $C$, one of the three sets $isa(c)$, $att(c)$ or $meth(c)$ is not
empty (i.e. $\forall c \in C,\ att(c) \cup meth(c) \cup isa(c) \neq \emptyset$). ■



For each $c$ in $C$, the tuple $(c, isa(c), att(c), meth(c))$ is called a class of $S$ with name $c$.
Each $(a, t)$ in $att(c)$ is called an attribute of $c$ with name $a$ and type $t$; each
$(m, t_1 t_2 \ldots t_k)$ in $meth(c)$ is called a method of $c$ with name $m$ and profile
$t_1 t_2 \ldots t_k$. The above definition implies that classes of a schema have distinct
names. Thus, a class can be recognized by its name. We read $c\ isa\ c'$ also as
$c$ inherits $c'$. The transitive and reflexive closure of $isa$ is denoted by $\leq_{isa}$. We
read $c \leq_{isa} c'$ as $c$ is a subclass of $c'$ or $c'$ is a superclass of $c$.

Following Definitions 1 and 2, the functionality of $att$ and $meth$ means that
overloading of attributes or methods is not allowed within a class. However, this
does not prevent overloading attributes or methods which are in distinct classes.
Therefore, an attribute or a method is completely determined only in the context
of a class and not intrinsically in the whole schema. In fact in the whole schema,
an attribute $(a, t)$ of $c$ should be seen as $(c, a, t)$ and a method $(m, t_1 t_2 \ldots t_k)$ of $c$
should be seen as $(c, m, t_1 t_2 \ldots t_k)$.

Our definition of schema does not impose any restrictions on inheritance. Single 
inheritance as well as multiple inheritance are allowed. The last condition of 
Definition 2 says that if a class has no attributes and no methods, it must at 
least inherit another class, and if a class does not inherit any other class and has 
no methods (or attributes) it must have at least one attribute (or method). 
According to the usual notation of the object-oriented paradigm, we shall denote
an attribute $(a, t)$ of a class by $a : t$ and a method $(m, t_1 \ldots t_{k-1} t_k)$ by
$m : t_1 \ldots t_{k-1} \to t_k$. Since $t_1 \ldots t_k$ is a non empty sequence, some methods
have the form $m : \to t$ (methods without parameters). Such a method can be regarded
as a computed attribute.

Our definition of schema and class is similar to the
standard schema and class declarations of most
object-oriented data models. Thus the standard
declarations of the opposite figure correspond in
our setting to

$isa(c) = \{c_1, \ldots, c_m\}$
$att(c) = \{(a_1, t_1), \ldots, (a_n, t_n)\}$
$meth(c) = \{(m_1, t^1_1 \ldots t^1_{k_1}), \ldots, (m_p, t^p_1 \ldots t^p_{k_p})\}$

A finite set of such declarations forms a schema iff
the resulting functions $isa$, $att$ and $meth$ satisfy
the conditions of Definition 2.

The schema of Figure 1 will be our running example throughout the paper. In 
this example, the method bonus in class Prof needs an argument of type int 
(for example the number of students that he supervises), but the same method 
in Emp has no arguments. 




class Pers
  attributes:
    name : string
    ssn : int

class Emp inherit Pers
  attributes:
    charge : string
    hired_date : int
    salary : int
  methods:
    bonus : -> int
    seniority : int -> int

class Dir inherit Emp
  attributes:
    appoint_date : int
  methods:
    seniority : int -> int

class Prof inherit Emp
  attributes:
    supervise : set Stud
    teaches : set Course
    charge : set Proj
  methods:
    bonus : int -> int

class Stud inherit Pers
  attributes:
    supervisor : Prof
    takes : set Course

class Tutor inherit Stud, Emp
  methods:
    bonus : -> int

Fig. 1. An example of schema
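
Read against Definition 2, the declarations of Fig. 1 can be spelled out relation by relation. The following lines do this for the classes Emp and Tutor. They are an illustration worked out here from the figure, with a parameterless method recorded by the one-element profile consisting of its result type.

```latex
\begin{align*}
isa(\mathit{Emp})   &= \{\mathit{Pers}\} \\
att(\mathit{Emp})   &= \{(\mathit{charge}, \mathit{string}),\ (\mathit{hired\_date}, \mathit{int}),\ (\mathit{salary}, \mathit{int})\} \\
meth(\mathit{Emp})  &= \{(\mathit{bonus},\ \mathit{int}),\ (\mathit{seniority},\ \mathit{int}\ \mathit{int})\} \\[1ex]
isa(\mathit{Tutor})  &= \{\mathit{Stud},\ \mathit{Emp}\} \\
att(\mathit{Tutor})  &= \emptyset \\
meth(\mathit{Tutor}) &= \{(\mathit{bonus},\ \mathit{int})\}
\end{align*}
```

Tutor declares no attribute, but the last condition of Definition 2 is still met because both $isa(\mathit{Tutor})$ and $meth(\mathit{Tutor})$ are non empty.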
2.2 Inheritance and the Overloading Problem

The inheritance hierarchy of the schema provides a mechanism allowing us to
relate together properties of classes. More precisely, when $c \leq_{isa} c'$, each object
of $c$ may be seen as an object of $c'$, thus properties of $c'$ may be considered
also as properties of $c$. But this consideration can cause a name conflict in $c$.
For instance, in our example, the attribute charge : string in Emp gives the
assignment of an employee, whereas the same attribute name in the class Prof
corresponds to the attribute charge : set Proj, which gives the set of projects
that a professor is in charge of. A similar name conflict will happen for methods
seniority in Emp and seniority in Dir even if they have the same type. In fact
the seniority of a person as a director may differ from his/her seniority as an
employee.

One way to avoid such name conflicts is to rename inherited attributes or inherited
methods whenever a conflict may arise. Since the discussion that follows is the same
for attributes and for methods, we shall carry it out for attributes
only. The same results are valid for methods.

Definition 3 Let $c$ and $c_1$ be two classes of a schema $S$ such that $c \leq_{isa} c_1$. We
say $(c_1, a_1, t_1)$ does not conflict with the class $c$ if for every attribute $(c_2, a_2, t_2)$
of $S$, we have $c_1 \leq_{isa} c_2$ whenever $c \leq_{isa} c_2$ and $a_1 = a_2$. ■

In particular, the reflexivity of $\leq_{isa}$ implies readily that an attribute of a class
c does not conflict with c itself. In our example no attribute of the class Pers 
conflicts with the classes Emp and Prof. The attribute charge of Emp conflicts 
with Prof but does not conflict with Dir. Similarly, the method seniority of 
Emp conflicts with the class Dir. But the method bonus in Tutor does not 
create any conflict. Now, we can express our renaming procedure as follows : 
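
Definition 3 can be checked mechanically on the schema of Fig. 1. The following Java sketch is only an illustration: the class Schema and its methods declare, subclassOf, conflicts and example are hypothetical names, types are ignored, and attribute and method names are kept in one set per class (harmless here, since attribute names and method names come from disjoint sets).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Schema {
    // isa(c): the direct superclasses of class c
    private final Map<String, Set<String>> isa = new HashMap<>();
    // names of the attributes and methods declared in class c (types ignored)
    private final Map<String, Set<String>> declared = new HashMap<>();

    void declare(String c, List<String> supers, List<String> names) {
        isa.put(c, new HashSet<>(supers));
        declared.put(c, new HashSet<>(names));
    }

    // c <=_isa d: reflexive and transitive closure of isa (isa is acyclic)
    boolean subclassOf(String c, String d) {
        if (c.equals(d)) return true;
        for (String s : isa.getOrDefault(c, Collections.<String>emptySet()))
            if (subclassOf(s, d)) return true;
        return false;
    }

    // Negation of Definition 3: the property `name` of c1 conflicts with c
    // if some class c2 with c <=_isa c2 declares `name` while c1 <=_isa c2 fails.
    boolean conflicts(String c1, String name, String c) {
        for (Map.Entry<String, Set<String>> e : declared.entrySet()) {
            String c2 = e.getKey();
            if (e.getValue().contains(name) && subclassOf(c, c2) && !subclassOf(c1, c2))
                return true;
        }
        return false;
    }

    // The schema of Fig. 1, reduced to class names and property names.
    static Schema example() {
        Schema s = new Schema();
        s.declare("Pers", Collections.<String>emptyList(), Arrays.asList("name", "ssn"));
        s.declare("Emp", Arrays.asList("Pers"),
                  Arrays.asList("charge", "hired_date", "salary", "bonus", "seniority"));
        s.declare("Dir", Arrays.asList("Emp"), Arrays.asList("appoint_date", "seniority"));
        s.declare("Prof", Arrays.asList("Emp"),
                  Arrays.asList("supervise", "teaches", "charge", "bonus"));
        s.declare("Stud", Arrays.asList("Pers"), Arrays.asList("supervisor", "takes"));
        s.declare("Tutor", Arrays.asList("Stud", "Emp"), Arrays.asList("bonus"));
        return s;
    }
}
```

On this schema, conflicts("Emp", "charge", "Prof") and conflicts("Emp", "seniority", "Dir") hold, while conflicts("Emp", "charge", "Dir") and conflicts("Pers", "name", "Prof") do not, in line with the examples given in the text.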

Renaming procedure: For every class $c$ and every superclass $c_1$ of $c$, if a
property (i.e. attribute or method) $p$ of $c_1$ conflicts with $c$ then rename $p$ in $c$.

Our formal way for renaming will be prefixing. For example if $c \leq_{isa} c_1$ and if
an attribute $a_1 : t_1$ (a method $m_1 : t_1 \ldots t_{k-1} \to t_k$) of $c_1$ conflicts with $c$ then as
an attribute (a method) of $c$ it will be denoted $(c_1)a_1 : t_1$ ($(c_1)m_1 : t_1 \ldots t_{k-1} \to t_k$,
respectively). In practice, instead of prefixing names, new names may be
introduced. We stress that our renaming procedure does not depend on the
type of the property we have to rename; and the renamed property has the
same type as the original one. As a consequence, our notion of inheritance does
not impose any covariance or contravariance conditions. This is in contrast to
the sub-typing in object programming languages in which the covariance or
contravariance constraints enforce the type of a redefined attribute (method)
in a subclass, to decrease or to increase respectively [1,6]. For more detail see
Section 4.3.
3 The Type System

3.1 Concrete Types 

For defining schema we have used types syntactically. Now, we need to know
what the meaning of a type is. Let us add to our type system a new special type
with one element, called unit. Thus:

$T_C ::= \beta \mid C \mid T \otimes T \mid \mathrm{set}\ T \qquad T ::= T_C \mid \mathrm{unit}$

Each basic type name $t$ is assumed to denote at most one denumerable set $[\![t]\!]$
that we will call the concrete type of $t$. The concrete type of each class name is
supposed to be a special denumerable set oid, which is disjoint from all other
basic concrete types. The elements of oid are called object identities. Now, we
consider a symbol $\bot$ and we assume that it denotes an element outside the
concrete basic types and the concrete type oid. The concrete types of all other
types are defined recursively as follows:

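- $[\![\mathrm{unit}]\!] = \{\bot\}$,

- $[\![t_1 \otimes t_2]\!] = ([\![t_1]\!] \times [\![t_2]\!]) + ([\![t_1]\!] \times [\![\mathrm{unit}]\!]) + ([\![\mathrm{unit}]\!] \times [\![t_2]\!])$,

- $[\![\mathrm{set}\ t]\!] = \mathcal{P}_f([\![t]\!])$,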

where $\times$ and $+$ are the usual cartesian product and the usual coproduct (disjoint union)
of sets. Every element of a concrete type $[\![t]\!]$ is called a value of $t$. Values will serve
to define the stored part of a database. It is important to note that the semantics
of $\otimes$ is not the usual cartesian product semantics. However, since $[\![t_1 \otimes t_2]\!]$ is a
disjoint union of products, we write an element of $[\![t_1 \otimes t_2]\!]$ as $(v_1, v_2)$ where $v_1$ or
$v_2$ (but not both) may be the symbol $\bot$. This special semantics of $\otimes$ will allow
us to deal with null values and later on with partial functions. Indeed, in the
context of databases the special symbol $\bot$ can be seen as the value null. Then
the above semantics of $\otimes$ expresses that some component of a tuple value can be
null. Later on we shall see another role of the symbol $\bot$ which will express the
undefinedness of functions. If we denote $[\![t]\!]_\bot = [\![t]\!] + [\![\mathrm{unit}]\!]$ then we can write
down readily the following theorem which relates $\otimes$ to the cartesian product:

Theorem 1 For all types $t_1$ and $t_2$, $[\![t_1]\!]_\bot \times [\![t_2]\!]_\bot$ and $[\![t_1 \otimes t_2]\!]_\bot$ are isomorphic. ■
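
One way to see why such an isomorphism exists is to write a candidate bijection down explicitly. The map $\varphi$ below is notation introduced here for illustration and is not taken from the paper. It sends the pair of two undefined components to the single extra element $\bot$ and leaves every other pair unchanged.

```latex
\[
\varphi : [\![t_1]\!]_\bot \times [\![t_2]\!]_\bot \;\to\; [\![t_1 \otimes t_2]\!]_\bot,
\qquad
\varphi(v_1, v_2) =
\begin{cases}
\bot       & \text{if } v_1 = \bot \text{ and } v_2 = \bot, \\
(v_1, v_2) & \text{otherwise.}
\end{cases}
\]
% Sanity check in the finite case: with |[[t_1]]| = m and |[[t_2]]| = n,
\[
(m + 1)(n + 1) \;=\; \underbrace{mn + m + n}_{|[\![t_1 \otimes t_2]\!]|} \;+\; 1 .
\]
```

The counting line is only a plausibility check for finite sets; concrete types are denumerable in general, and there the explicit map carries the argument.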



3.2 Classes as Types 

In any type system, basic types come with operations. Usually, an abstract type is
a named user-defined type. In our setting if $S = (C, isa, att, meth)$ is a database
schema then every class $(c, isa(c), att(c), meth(c))$ can be seen as a special abstract
type. The name $c$ of the class is the name and also a sort of this abstract
type. Operations of $c$ are then defined as follows:

— if $(c, a, t)$ is an attribute then $a : c \to t$ is an operation over $c$;

— if $c \leq_{isa} c'$ and $(c', a, t)$ is an attribute that does not conflict with $c$, then
$a : c \to t$ is an inherited operation over $c$;

— if $c \leq_{isa} c'$ and $(c', a, t)$ is an attribute which conflicts with $c$, then $(c')a : c \to t$
is a renamed operation over $c$;

— if $(c, m, t_1 \ldots t_{k-1} t_k)$ is a method then $m : c \otimes t_1 \otimes \cdots \otimes t_{k-1} \to t_k$ is an operation
over $c$;

— if $c \leq_{isa} c'$ and $(c', m, t_1 \ldots t_{k-1} t_k)$ is a method that does not conflict with $c$
then $m : c \otimes t_1 \otimes t_2 \otimes \cdots \otimes t_{k-1} \to t_k$ is an inherited operation over $c$;

— if $c \leq_{isa} c'$ and $(c', m, t_1 \ldots t_{k-1} t_k)$ is a method which conflicts with $c$ then
the renamed method $(c')m : c \otimes t_1 \otimes t_2 \otimes \cdots \otimes t_{k-1} \to t_k$ is an operation over $c$;

— if c <isa c' then : c — > c' is an operation over c. 

The above considerations reflects actually the semantics that we have in mind for 
a class (see Section 4). Thus, we can see a schema as a set of types and abstract 
type specifications. When the data model is embedded in a type system, a schema 
can be seen as a set of types and abstract types specifications. For instance, in 
02 [12] and in GPL [5] schemas are defined in this way. 



3.3 Functional Terms 

In an object data model, calculations appear in two ways. On the one hand
they serve to define the dynamic part (i.e. methods) of the database, and on
the other hand they perform arithmetical computation, for instance adding or
pairing attribute values, accessing a value by a path expression, or calling a
method on an object. In our approach each calculation is a term of an algebra.
This algebra acts on partial functions and is defined by the following rules : 



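$$id^* : t \to t \qquad\qquad \frac{f : t_1 \to t_2 \qquad g : t_2 \to t_3}{f.g : t_1 \to t_3}$$

$$fst : t_1 \otimes t_2 \to t_1 \qquad snd : t_1 \otimes t_2 \to t_2 \qquad \frac{f_1 : t \to t_1 \qquad f_2 : t \to t_2}{\langle f_1, f_2 \rangle : t \to t_1 \otimes t_2}$$

$$ter^* : t \to unit \qquad\qquad undef^* : unit \to t$$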



Functional terms of a given schema are outputs of these rules whenever inputs 
are operations of the schema or operations of basic types. 

Note : The superscript * stresses partiality and will be motivated at the semantic
level.

Definition 4 Functional terms of a schema S are defined recursively as follows : 

— if v is a value of a basic type t then $v^* : unit \to t$ is a functional term, called
a constant;

— each basic type operation or class operation is a functional term; 

— if inputs of a rule are functional terms then so is its output. ■ 
Note that the signature of a binary operation on a type t is no longer $t \times t \to t$
but $t \otimes t \to t$.

Example 1 Suppose bool, int and string are basic types. Let add* and mul*
denote, respectively, the prefix notations of the usual operations + and *. The
following are functional terms in our example.

— $name : Pers \to string$, $name : Stud \to int$, $name : Prof \to string$

— $\langle name,\ supervisor.name \rangle : Stud \to string \otimes string$

— $\langle salary,\ ter^*.12^* \rangle.mul^* : Emp \to int$

— $bonus : Emp \to int$, $bonus : Prof \otimes int \to int$, $(Emp)bonus : Prof \to int$

— $\langle fst.(Emp)bonus,\ \langle snd,\ ter^*.100^* \rangle \rangle.add^* : Prof \otimes int \to int$. ■



4 The Semantics 

4.1 Database Instance 

Roughly speaking, we see an instance of a database as a finite set of persistent
objects and a code for each method. As usual, an object is a pair (i, v) where i
is an object identity and v is a value of a concrete type $\llbracket t \rrbracket$. According to our
recursive construction of $\llbracket t \rrbracket$, such a value may have a complex structure. Thus,
v may itself refer to other object identities. That the value v refers to an object
identity j indicates that the type expression t has used a class
name in its construction. But, since we have interpreted every class name by
the same set oid, we are now unable to say which class name has caused the
appearance of j in v. However, we need this lost information. The reason is the
following natural principle, which is supported in most object-oriented systems:
If a persistent object refers to another object, the latter is also persistent.

Definition 5 For every type expression t, every value v of $\llbracket t \rrbracket$ and every class
name c of the schema S, the set $ref(v : t, c)$ is defined recursively as follows:

— $ref(v : t, c) = \emptyset$, for every basic type t;

— $ref(v : c', c) = \emptyset$ if $c' \neq c$, and $ref(v : c, c) = \{v\}$;

— $ref(v : set\ t, c) = ref(v_1 : t, c) \cup \ldots \cup ref(v_n : t, c)$, where $v = \{v_1, \ldots, v_n\}$;

— $ref(v : t_1 \otimes t_2, c) = ref(v_1 : t_1, c) \cup ref(v_2 : t_2, c)$, where $v = (v_1, v_2)$. ■

The set $ref(v : t, c)$ is the set of all object identities which appear in v because
of the presence of c somewhere in the type expression t.

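In our approach a functional term $e : t_1 \to t_2$ does not denote a function
from $\llbracket t_1 \rrbracket$ to $\llbracket t_2 \rrbracket$ but a program which corresponds to a partial function
from $\llbracket t_1 \rrbracket$ to $\llbracket t_2 \rrbracket$. We see such a partial function as a function $\llbracket e \rrbracket : \llbracket t_1 \rrbracket \to \llbracket t_2 \rrbracket_\bot$
such that "e is undefined on x" means that $\llbracket e \rrbracket(x) \in \llbracket unit \rrbracket = \{\bot\}$ (recall that
$\llbracket t \rrbracket_\bot = \llbracket t \rrbracket + \llbracket unit \rrbracket$). Thus the symbol $\bot$ expresses the undefinedness of functions.
Expressing $\bot$ as the undefined value, coupled with the semantics of $\otimes$ seen so
far, allows a rigorous treatment of the null value in the database paradigm. Note
that the symbol $\bot$ appears only in the codomain of $\llbracket e \rrbracket$.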
Definition 6 A database instance over a schema S = (C, isa, att, meth) is a
function $\delta$ that associates

1. with every class name c of C a finite subset $\delta(c)$ of oid, such that:

— if $c <_{isa} c'$ then $\delta(c) \subseteq \delta(c')$,

— if c and c' have no common subclass and no common superclass then
$\delta(c) \cap \delta(c') = \emptyset$;

2. with every attribute (c, a, t) of S, a finite function $a_\delta : \llbracket c \rrbracket \to \llbracket t \rrbracket_\bot$ such
that:

— $def(a_\delta) \subseteq \delta(c)$,

— for all $i \in \delta(c)$ and $c' \in C$, $ref(a_\delta(i) : t, c') \subseteq \delta(c')$;

3. with every method $(c, m, t_1 \ldots t_{k-1} t_k)$ of S, a functional term
$m_\delta : c \otimes t_1 \otimes \cdots \otimes t_{k-1} \to t_k$. ■

Note that in this definition each class name is seen as a persistent root. There is
an explicit distinction between the stored and the computed part of the database.
The stored part is defined by clauses 1 and 2 and the computed part by clause
3. The stored part is finite, and consists of a finite set of objects and a set
of finite functions, one for each attribute. The computed part is "infinite" and
consists of a set of codes, one for each method. A code is an abstract syntax
(i.e. a functional term). The second part of the second clause of Definition 6
says that a persistent object cannot refer to another object unless that object is
persistent. This is the principle of persistence seen earlier. The first
clause of Definition 6 requires that $\delta(c) \subseteq \delta(c')$ whenever $c <_{isa} c'$. This means
that the semantics of inheritance is set inclusion.
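The first condition of clause 1 can be checked mechanically on a stored instance. The following C++17 sketch assumes a deliberately simplified representation that is not taken from the paper: object identities are integers, class names are strings, and the hypothetical names `Extent`, `Instance`, `IsaRelation` and `inheritance_is_inclusion` are introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Oid = int;                                 // assumed representation of object identities
using Extent = std::set<Oid>;                    // delta(c): the finite set of objects of class c
using Instance = std::map<std::string, Extent>;  // class name -> delta(c)
// Pairs (c, c') with c <isa c'.
using IsaRelation = std::vector<std::pair<std::string, std::string>>;

// Returns true when delta(c) is a subset of delta(c') for every pair c <isa c'.
// Instance::at throws std::out_of_range if a class has no extent in delta.
bool inheritance_is_inclusion(const Instance& delta, const IsaRelation& isa) {
    for (const auto& [sub, super] : isa) {
        const Extent& sub_extent = delta.at(sub);
        const Extent& super_extent = delta.at(super);
        // std::includes needs sorted ranges, which std::set guarantees.
        if (!std::includes(super_extent.begin(), super_extent.end(),
                           sub_extent.begin(), sub_extent.end()))
            return false;
    }
    return true;
}

// Small example: students and professors are persons.
bool demo_valid_instance() {
    Instance delta{{"Pers", {1, 2, 3}}, {"Stud", {2}}, {"Prof", {3}}};
    IsaRelation isa{{"Stud", "Pers"}, {"Prof", "Pers"}};
    return inheritance_is_inclusion(delta, isa);
}

// Same schema, but object 9 is a student without being a person.
bool demo_invalid_instance() {
    Instance delta{{"Pers", {1, 2, 3}}, {"Stud", {2, 9}}, {"Prof", {3}}};
    IsaRelation isa{{"Stud", "Pers"}, {"Prof", "Pers"}};
    return inheritance_is_inclusion(delta, isa);
}
```

The disjointness condition and the conditions on attributes in clause 2 could be checked in the same style; they are omitted here to keep the sketch short.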



4.2 Semantics of Rules 

In order to define the semantics of the rules we recall some practical notations : 

— For two sets $T_1$, $T_2$, $\pi_i : T_1 \times T_2 \to T_i$ (i = 1, 2) will be the usual projections
and $in_i : T_i \to T_1 + T_2$ the usual coprojections. When $T_2 = \{\bot\}$ we write
$def^{T_1} : T_1 \to T_1 + \{\bot\}$ and $undef^{T_1} : \{\bot\} \to T_1 + \{\bot\}$ instead of $in_1$ and
$in_2$. We omit superscripts when there is no risk of confusion.

— For $f_i : A \to A_i$ (i = 1, 2), the function $\langle f_1, f_2 \rangle : A \to A_1 \times A_2$ is defined by
$\langle f_1, f_2 \rangle(x) = (f_1(x), f_2(x))$. Similarly, for $g_i : A_i \to A$ (i = 1, 2) the function
$[g_1, g_2] : A_1 + A_2 \to A$ is defined by:

$[g_1, g_2](x)$ = if ($x = in_1\, y$) then $g_1(y)$ else if ($x = in_2\, z$) then $g_2(z)$.

— If $f_i : A_i \to B_i$ (i = 1, 2) then $f_1 + f_2 : A_1 + A_2 \to B_1 + B_2$ is an abbreviation
for $[in_1 \circ f_1, in_2 \circ f_2]$.

— For every set T the function $ter : T \to \{\bot\}$ is the unique function from T
to $\{\bot\}$, and $id : T \to T$ is the identity function. An element a of T is seen as a
function $a : \{\bot\} \to T$, where $a(\bot) = a$.

Now, we define the semantics of the rules as follows : 
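$$\llbracket id^* \rrbracket = def \circ id = def \qquad \llbracket f.g \rrbracket = [\llbracket g \rrbracket, undef] \circ \llbracket f \rrbracket \qquad \llbracket \langle f, g \rangle \rrbracket = \langle \llbracket f \rrbracket, \llbracket g \rrbracket \rangle$$

$$\llbracket snd \rrbracket = \ldots$$

$$\llbracket ter^* \rrbracket = def \circ ter \qquad \llbracket undef^* \rrbracket = undef$$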



Note that at the syntax level we compose functions in left-to-right order, which
corresponds to program chaining, whereas at the semantics level we use the
classical right-to-left order of function composition. The apparent
complexity of the above semantics is due to our concern for treating null values
and undefinedness rigorously. This semantics says that the operation $\llbracket fst \rrbracket$
($\llbracket snd \rrbracket$) is undefined whenever its first (second) argument is null. Note the
introduction of $[\llbracket g \rrbracket, undef]$ for defining $\llbracket f.g \rrbracket$. Indeed, $\llbracket g \rrbracket \circ \llbracket f \rrbracket$ is ill-typed whereas
$[\llbracket g \rrbracket, undef] \circ \llbracket f \rrbracket$ is well-typed.
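The typing remark can be made concrete with a short C++17 sketch in which a partial function from A to B is modelled as a total function returning `std::optional<B>`, with `std::nullopt` standing in for $\bot$. The names `Partial`, `compose` and `half_then_pred` are hypothetical and serve only as an illustration of the copairing with undef; they are not part of the paper's formalism.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// A partial function A -> B, seen as a total function A -> B + {bottom}.
template <class A, class B>
using Partial = std::function<std::optional<B>(const A&)>;

// compose(f, g) models the term f.g (left-to-right chaining).
// g cannot be applied directly to the result of f, because that result may be
// bottom; the case analysis below plays the role of [[[g]], undef].
template <class A, class B, class C>
Partial<A, C> compose(Partial<A, B> f, Partial<B, C> g) {
    return [f, g](const A& x) -> std::optional<C> {
        std::optional<B> y = f(x);    // [[f]](x)
        if (!y) return std::nullopt;  // the undef branch: undefinedness propagates
        return g(*y);                 // the [[g]] branch
    };
}

// Example: halve an even number, then take the predecessor of a positive number.
std::optional<int> half_then_pred(int x) {
    Partial<int, int> half = [](const int& n) -> std::optional<int> {
        if (n % 2 != 0) return std::nullopt;  // undefined on odd numbers
        return n / 2;
    };
    Partial<int, int> pred = [](const int& n) -> std::optional<int> {
        if (n <= 0) return std::nullopt;      // undefined on non-positive numbers
        return n - 1;
    };
    return compose(half, pred)(x);
}
```

For the argument 8 both steps are defined and the result is 3; for 7 the first step is undefined, and for 0 the second step is undefined, so in both cases the composite yields no value.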



4.3 Semantics of Functional Terms 

According to Definition 4, functional terms are obtained recursively from a 
schema S using rules. The basis of the recursion consists of constants, basic 
type operations and class operations. 

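Semantics of basic operations : In our type system a binary operation of a
basic type looks like $op^* : t \otimes t \to t$. Thus, its semantics is $\llbracket op^* \rrbracket : \llbracket t \otimes t \rrbracket \to \llbracket t \rrbracket_\bot$.
Since the semantics of $t \otimes t$ is $\llbracket t \rrbracket \times \llbracket t \rrbracket + \{\bot\} \times \llbracket t \rrbracket + \llbracket t \rrbracket \times \{\bot\}$, one of the two
arguments of $\llbracket op^* \rrbracket$ may be undefined (i.e. equal to $\bot$). We assume that the result
of $\llbracket op^* \rrbracket$ is $\bot$ whenever one of its arguments is $\bot$. Formally,

$$\llbracket op^* \rrbracket = op + \ldots$$

where $op : \llbracket t \rrbracket \times \llbracket t \rrbracket \to \llbracket t \rrbracket$ is a usual binary operation on $\llbracket t \rrbracket$. For a unary or
a 0-ary operation $f^*$ of basic types we have $\llbracket f^* \rrbracket = def \circ f$. Similarly, if a is a
value of a basic concrete type $\llbracket t \rrbracket$ then $\llbracket a^* \rrbracket = def \circ a$.

The rest of the semantics will be generated from a database instance $\delta$.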

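Semantics of inheritance relationships : We have expressed $c <_{isa} c'$
syntactically as the operation $isa^{c,c'} : c \to c'$ of c (Section 3.2). According to
Definition 6, $\delta(c) \subseteq \delta(c')$; thus, there is a partial function $isa_\delta^{c,c'} : \llbracket c \rrbracket \to \llbracket c' \rrbracket_\bot$
corresponding to this inclusion. Therefore we define $\llbracket isa^{c,c'} \rrbracket = isa_\delta^{c,c'}$.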

Semantics of attributes : The attributes of a class act on objects of that class.
More precisely,

— For every attribute (c, a, t) in S, $\llbracket a \rrbracket^c = a_\delta$.

— For all classes c, c', if $c <_{isa} c'$ then for every attribute (c', a, t) which does
not conflict with c, $\llbracket a \rrbracket^c = \llbracket isa^{c,c'}.a \rrbracket$ $(= [\llbracket a \rrbracket^{c'}, undef] \circ \llbracket isa^{c,c'} \rrbracket)$.

— For all classes c, c', if $c <_{isa} c'$ then for every attribute (c', a, t) which conflicts
with c, $\llbracket (c')a \rrbracket^c = \llbracket isa^{c,c'}.a \rrbracket$.

The second (third) of the above clauses says that the semantics of a as an inherited
(renamed) attribute of c is the semantics of a as an attribute of c' but restricted
to objects in c.



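Semantics of methods : Contrary to attributes, methods operate on objects
according to the designer's or user's choice of static or late binding. In static
binding a method operates in the same way on all objects of a class, but in late
binding the operation on an object depends on the way that the object is shared
by other classes. The following semantics of methods is suitable only for
static binding (late binding needs sub-typing and some constraints on the schema,
and will be treated in a forthcoming paper):

— For every method $(c, m, t_1 \ldots t_k)$ in S, $\llbracket m \rrbracket^c = \llbracket m_\delta \rrbracket$.

— For all classes c, c', if $c <_{isa} c'$ then for every method $(c', m, t_1 \ldots t_k)$ which
does not conflict with c,

$$\llbracket m \rrbracket^c = \begin{cases} \llbracket isa^{c,c'}.m \rrbracket \ (= [\llbracket m \rrbracket^{c'}, undef] \circ \llbracket isa^{c,c'} \rrbracket), & \text{if } k = 1 \\ \llbracket \langle fst.isa^{c,c'}, snd \rangle.m \rrbracket, & \text{if } k > 1 \end{cases}$$

— For all classes c, c', if $c <_{isa} c'$ then for every method $(c', m, t_1 \ldots t_k)$ which
conflicts with c,

$$\llbracket (c')m \rrbracket^c = \begin{cases} \llbracket isa^{c,c'}.m \rrbracket, & \text{if } k = 1 \\ \llbracket \langle fst.isa^{c,c'}, snd \rangle.m \rrbracket, & \text{if } k > 1 \end{cases}$$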



5 Concluding Remarks 

We have introduced a formal object-oriented data model with partial semantics.
Partiality has been used in the model-oriented approach of computation theory
for representing the possibility of failure during program execution. In this theory
a partial map f from X to Y is seen as a pair $D_f \to X$, $D_f \to Y$ of total
maps such that $D_f \to X$ is an inclusion [16]. We have considered a partial
map as a total map $X \to Y \cup \{\bot\}$, where $\bot$ is supposed to be outside Y. This
point of view is better suited to database theory, in which failure corresponds to
undefinedness.

We have considered a database as a set of partial functions. Each function
represents an attribute or a method, and $\bot$ represents the null value (or an undefined value).
A similar approach has been proposed in [14,15] from a categorical point of
view, but without considering methods and binding modes. This paper also
treats methods and static binding, and substantially refines their concept of
inheritance. However, we restrict our study to methods without side effects.
According to [18] the objective of the object-relational model is to extend the
relational model by providing a richer type system including object orientation.
In this sense our model can be seen as an object-relational model. We have
endowed the type system with an algebra of functions by means of rules. These
rules are similar (but not equivalent) to those presented in [5]. The similarity
comes from the fact that both contain a common mathematical structure.
But our semantics for this structure is the universe of sets and partial maps,
whereas their semantics is based on collection types and total functions. For this
reason we have introduced a particular semantics for $\otimes$, whereas they use the
usual cartesian product $\times$. Several aspects of particular interest have not been
presented here, especially dynamic binding, the query language and some interesting
developments that concern category theory. These aspects will be reported in
forthcoming work.
1 Motivation

The incentive to write a nested, heterogeneous container in C++ surfaced in 
the SuchThat project [11]. Therein we are working on the implementation of 
a SuchThat compiler. The first prototype’s back-end [14], as well as many of 
the other components, were implemented in Scheme [8]. One of Scheme’s main 
advantages is the powerful list data structure, which can hold arbitrary data 
types¹. This allows the user to build nested lists, e.g. to represent a parse tree
or symbol table. 

Our current focus is on merging Tecton [7] with SuchThat. Due to severe 
performance problems with our first prototype we have switched to C++ as 
implementation language. The STL provides basic containers that suit most 
simple needs and exhibit very good runtime behavior. The containers’ major 
drawback for our purposes is the inability to hold objects of different types and 
that they do not support nesting. 

We will show that exploitation of C++’s newest technologies, like templates 
and run-time type information (RTTI), leads to a powerful data structure based 
on the STL. We think that the different paradigms of generic, object-oriented
and functional programming, which are often seen as adversaries, can instead
complement each other.



2 Approaches 

We observed a trade-off between syntactic elegance and runtime performance.
This led us to two fundamentally distinct approaches.

The first one, the more conservative, relies on an abstract base class that
provides polymorphic behavior with easy-to-use parameterized standard elements.
The nseq class uses template template arguments (see [1], 14.3.3) for maximal
flexibility.

The second approach builds on the semantics of chameleon objects [12]. It is 
outperformed by the first one regarding runtime but excels in usability. 

¹ Of course, this holds true for any untyped language.
2.1 Specification of the Problem

The following three requirements informally state what we call a
heterogeneous, nested sequence S.

1. Every STL sequence container should be applicable as underlying implemen- 
tation container of S (flexibility property). 

2. S should be able to hold arbitrary objects (heterogeneity property). 

3. Any nested sequence S should be able to hold other nested sequences recur- 
sively (nesting property). 



2.2 Classical Polymorphism 

The well-established way in C++ to provide polymorphic behavior uses
inheritance. The heterogeneous container holds pointers to a base class and the
C++ runtime system will dispatch methods based on the polymorphic type.
Our base class BaseElem declares the virtual functions BaseElem* clone() and
BaseElem* create() to support the virtual constructor idiom (see [2], 20.5).

Instead of letting users write wrapper objects for every type they use,
we deliver the template class Elem<> that inherits from BaseElem. The signature
is template <class valT> class Elem : public BaseElem. This wrapper
class does all the tedious work users usually have to do on their own: define
constructors and destructors, as well as various auxiliary methods (e.g. I/O
functions). The user must only instantiate the template class. This works for basic types
and classes, e.g. Elem<int> i or Elem<string> s("test").

Let us examine the class nseq, which should fulfill the problem specification
given in section 2.1. The heterogeneity property can be obtained by keeping
pointers to the base class BaseElem in the sequence. If we want to comply with the
nesting property, nseq must be derived from BaseElem itself. Furthermore, in
order to use STL containers, nseq must also be a subclass of such a container.
These considerations lead to this signature of a nested list:

    class nlist : public list<BaseElem*>, public BaseElem

This works fine, but it does not fulfill the flexibility property, because the
implementation container is hard-coded. You cannot provide different containers, as
the class nlist is a subclass of the STL container instantiation list<BaseElem*>.
To gain the desired additional level of abstraction, we use template template
arguments, a very novel C++ feature. nseq becomes a template class whose template
argument is a container, which is a template class itself. Therefore, a simple
template would not suffice. The final class header for nseq reads:

    template <template <class valT> class containerT>
    class nseq : public containerT<BaseElem*>, public BaseElem

The clone() function has a boolean parameter shallow, which is of
importance for nested sequences only. It controls whether a shallow or a deep copy of
the container is made. A shallow copy creates a new container with pointers to
all top-level elements. A deep copy creates a new container, recursively holding
copies of the containers and atoms in the source sequence.
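The design described above can be sketched in a few lines of modern C++. This is a minimal sketch under stated assumptions, not the authors' implementation: the class names follow the text, but the member details are guesses; create() and the shallow parameter of clone() are omitted, so clone() always makes a deep copy; the template template parameter is declared variadic so that standard containers with an allocator parameter match; and the helper functions sum_ints and demo are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <string>

// Common base class: the container stores BaseElem* to allow heterogeneity.
struct BaseElem {
    virtual ~BaseElem() = default;
    virtual BaseElem* clone() const = 0;           // virtual constructor idiom
    virtual bool is_atom() const { return true; }  // false for nested sequences
};

// Generic wrapper for atoms of any copyable type.
template <class ValT>
class Elem : public BaseElem {
public:
    explicit Elem(const ValT& v) : value_(v) {}
    BaseElem* clone() const override { return new Elem(*this); }
    const ValT& getValue() const { return value_; }
private:
    ValT value_;
};

// Nested sequence: both a container of BaseElem* and a BaseElem itself.
template <template <class...> class ContainerT>
class nseq : public ContainerT<BaseElem*>, public BaseElem {
public:
    nseq() = default;
    nseq(const nseq&) = delete;             // assumption: copying goes through clone()
    nseq& operator=(const nseq&) = delete;
    ~nseq() override { for (BaseElem* e : *this) delete e; }  // owns its elements
    bool is_atom() const override { return false; }
    BaseElem* clone() const override {      // deep copy only in this sketch
        auto* copy = new nseq;
        for (const BaseElem* e : *this) copy->push_back(e->clone());
        return copy;
    }
};

// Hypothetical helper: sums all int atoms, descending into nested sequences.
template <template <class...> class ContainerT>
long sum_ints(const nseq<ContainerT>& seq) {
    long total = 0;
    for (const BaseElem* e : seq) {
        if (auto* i = dynamic_cast<const Elem<int>*>(e))
            total += i->getValue();
        else if (auto* nested = dynamic_cast<const nseq<ContainerT>*>(e))
            total += sum_ints(*nested);
    }
    return total;
}

// Builds (4711 "test" (1)), deep-copies it and sums the ints of the copy.
long demo() {
    nseq<std::list> outer;
    outer.push_back(new Elem<int>(4711));
    outer.push_back(new Elem<std::string>("test"));
    auto* inner = new nseq<std::list>;
    inner->push_back(new Elem<int>(1));
    outer.push_back(inner);
    BaseElem* copy = outer.clone();
    long sum = sum_ints(*static_cast<nseq<std::list>*>(copy));
    delete copy;
    return sum;
}
```

In this sketch the sequence owns its elements and deletes them in its destructor, which is one possible answer to the memory-management burden that pointer semantics places on the user.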
Working with our nseq class is quite simple. An instance of a nested deque
is created with nseq<deque> nd. We can use all of deque's member functions
to add elements to our container, e.g. nd.push_back(new Elem<int>(4711)).
The sequence can be walked with the STL container's iterators, but only at the
top-most level. You can recursively descend into nesting layers if the member
function bool is_atom() returns false, which is the case for elements that are
nested sequences. When you walk an nseq and want to operate on the elements,
you have to perform a dynamic cast on BaseElem. The following code example
shows how the first level of a nested sequence is walked with the iterator
provided by the container, and every integer entry is replaced by its square.

    for (nseq<list>::iterator iter = nl.begin(); iter != nl.end(); ++iter)
        if (Elem<int>* intp = dynamic_cast<Elem<int>*>(*iter))
            *intp = intp->getValue() * intp->getValue();



Figure 1 compares the layout of nseq<vector> and vector<int>². It shows
that we get a memory overhead of two pointers (eight bytes) for every element,
regardless of the wrapped object's type. The first one points to BaseElem. This
indirection is needed to use C++'s polymorphic mechanism. The second pointer
holds the address of the element's virtual table.

[Figure: a vector<int> holds three int entries of 4 bytes each; an nseq<vector> holds three BaseElem* entries of 4 bytes each, pointing to element objects that contain an int of 4 bytes.]

Fig. 1. Layout comparison of a standard STL container and a nested sequence



2.3 Chameleon Objects 

In [12] a new technique for providing a generic, type-safe wrapper class is pre- 
sented. This goal is achieved through the unparameterized class Value. Contrary 
to the class itself, all its methods, like the constructor and a set of overloaded op- 
erators, are parametrized, i.e. template functions. Thus, any object of arbitrary 
type can be assigned to a Value object due to the parameterized assignment op- 
erator template<class T> T operator=(const T&) . In turn, a Value object 
can easily be reassigned to an object of its initial type because of the parame- 
terized conversion operator template< class T> T& operator T(). 

The Value class guarantees strict type safety by signaling any attempt to
assign a Value object to another object of incorrect type by throwing an
exception³. To achieve this functionality, the Value class keeps all information about
the wrapped object in a private, static data member inside a member function,
parameterized with the same type.

² We assume that sizeof(int) = 4 and sizeof(pointer) = 4, which is true for most
contemporary 32-bit architectures.

³ Type identity is defined as name identity here.
Since all used data types are known at compile time, the compiler can
instantiate the corresponding methods and data objects. In fact, for every data type
used in conjunction with a Value object, a full set of operators and methods
is instantiated by the compiler. The type checking of any operation concerning
a Value object is performed by these methods at run time. Because of their
ability to change their internal type at any time, Value class instances are called
chameleon objects.

Given this Value class, we can now construct heterogeneous containers based
on the standard STL containers by instantiating such a container for Value with
list<Value> polyCont. Thereafter, objects of arbitrary type can be inserted
into the container, e.g. polyCont.push_back(12), polyCont.push_back(0.9)
and polyCont.push_back(string("hello")). Because, as stated above, Value
objects can hold objects of any type, even containers can be inserted as elements
into a Value-parameterized container, thus obtaining nested containers.

The extraction of elements from the container is straightforward, too. If we
know the desired element's type, we can simply query it. Given the list
polyCont from the previous example, we can write int i = polyCont.front() to
get the first element of the list. More care must be taken if the type of the desired
element is unknown. Since overloading the typeid() operator is not allowed in
C++ (see [1], 13.5), the Value class defines a method typeId(), which returns
the type information for the currently wrapped object. With this information
and Value's parameterized method template<class T> T& getValue(), one
can access every element of the nested, polymorphic container. This is shown in
the following sample code:

    // double sin(double); a function with this signature must exist
    void apply(list<Value>& cont) {
        for (list<Value>::iterator it = cont.begin(); it != cont.end(); ++it) {
            if (it->typeId() == typeid(list<Value>)) apply((list<Value>)*it);
            else if (it->typeId() == typeid(int)) *it = (int)*it * (int)*it;
            else if (it->typeId() == typeid(double)) *it = sin(*it);
        }
    }
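For readers without access to the Value class of [12], a comparable traversal can be sketched with std::any from C++17. Note that this swaps in a different technique: std::any is a standard library type with explicit std::any_cast access, not the chameleon object described above, and the names AnyList and demo are hypothetical. A mismatched std::any_cast throws std::bad_any_cast, which loosely corresponds to the run-time type checking of Value.

```cpp
#include <any>
#include <cassert>
#include <cmath>
#include <list>
#include <string>
#include <typeinfo>

using AnyList = std::list<std::any>;  // heterogeneous and nestable: an AnyList fits in a std::any

// Squares every int, applies sin to every double, and recurses into nested lists.
void apply(AnyList& cont) {
    for (std::any& item : cont) {
        if (item.type() == typeid(AnyList)) {
            apply(std::any_cast<AnyList&>(item));  // reference cast: modify in place
        } else if (item.type() == typeid(int)) {
            int v = std::any_cast<int>(item);
            item = v * v;
        } else if (item.type() == typeid(double)) {
            item = std::sin(std::any_cast<double>(item));
        }
        // elements of other types (e.g. std::string) are left untouched
    }
}

// Builds (3 0.0 "hello" (4)), runs apply, and checks each element.
bool demo() {
    AnyList inner;
    inner.push_back(4);
    AnyList cont;
    cont.push_back(3);
    cont.push_back(0.0);
    cont.push_back(std::string("hello"));
    cont.push_back(inner);

    apply(cont);

    auto it = cont.begin();
    int first = std::any_cast<int>(*it); ++it;
    double second = std::any_cast<double>(*it); ++it;
    std::string third = std::any_cast<std::string>(*it); ++it;
    int nested = std::any_cast<int>(std::any_cast<AnyList&>(*it).front());
    return first == 9 && second == 0.0 && third == "hello" && nested == 16;
}
```

Compared with the sample above, the casts are spelled out at every access; the implicit conversions of the Value class are what give the chameleon approach its syntactic elegance.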

3 Performance Tests 

The results of the performance tests are presented in Figure 2. We compared the
original STL list and vector containers against their nested counterparts based
on our implementations. The tests consisted of two parts, container creation and
element access. They were all performed for int and std::string data types.
The containers were filled in a loop using push_back(T&). Access was measured
by iterating over the created container and mutating its elements.

The charts show the overhead introduced by our containers for the flat,
homogeneous case, where of course their additional features are not used. The price
you have to pay for nesting and heterogeneity is a runtime penalty ranging from
a factor of 1.3 to 1.9 for complex objects (std::string in our tests) and 2.9 to 13 for
built-in types (int).
Furthermore, we want to note that due to the extensive use of the new
operator in our element classes, the tests depend heavily on the applied memory
allocation scheme. Therefore our source code includes the smart memory
allocator presented in [5], which speeds up the tests significantly compared to the
default new operator.

Our classes make heavy use of new C++ language features like RTTI and
templates. At the time of this writing we were able to compile our code with
egcs 1.1 [3], the EDG front-end [4] and IBM VisualAge C++ 4.0 [6].




Fig. 2. All tests ran on a Pentium II, 333 MHz, 128MB machine under Windows 
NT 4 and were compiled with egcs 1.1.1. The container size is 400000 elements 

4 Results 

We presented two distinct approaches to the problem of implementing a nested,
heterogeneous container in C++. The classical one shows better runtime
performance. Its overhead, compared to a standard STL container, arises from the
pointer indirection, which is necessary for the polymorphic approach, and the
virtual table pointer in the Elem<> class (see Figure 1). This generic wrapper
frees the user from tedious work. One drawback is the pointer semantics, which is
uncommon among STL containers with their value semantics. It also forces the user to handle
most aspects of memory management.

The container based on chameleon objects offers syntactic elegance that
equals untyped data structures, like those present in Scheme. Operations on
elements are inherently type-safe, and type violations are signalled by exceptions.
All this is made possible by the seamless integration of STL containers with the
Value class. It hides from the user many of the details of casting and type checking
that are still visible in our classical approach. No effort is required of
the user to adapt objects for storage in the nested container. The beauty of this
approach is bought at the cost of increased runtime.

Our classes leave it up to the programmer to choose either faster executing 
code or more elegant source code. 

5 Future Work 

We currently focus on implementing a flat_iterator, which behaves like a simple
sequence iterator and traverses all elements in a nested sequence in depth-first
order. This iterator will enable us to use STL algorithms on our nested sequences.

Another interesting question will be the use of different memory allocators
and the implementation of garbage collection ([10]) for the objects stored in the
containers. We believe this task can be addressed efficiently and transparently for
the user through the introduced element classes Elem<> and Value, respectively.

Finally, one can think of a reference counting mechanism, also implemented
through the mentioned element classes, which can lead to a dramatic
performance increase, especially in situations where deeply nested containers are
heavily copied for read-only purposes.
Abstract. For data flow analysis of Java programs to be correct and
precise, the flows induced by exceptions must be properly analysed. In our
data flow analysis, the implicit control flow for a raised exception is
represented explicitly. Exception branches, exception plateaus, and exception
exits for methods and method calls are introduced as additional control
flow structures for analysis of exception handling. These structures are
constructed dynamically under control of data flow analysis.



Introduction 

Java [7] is a new programming language that integrates many useful features of 
modern languages such as C++ and Oberon-2. In Java, exceptions are elaborated 
as a quite natural mechanism highly integrated with other parts of the language. 
Exceptions may be thrown by methods of the standard Java classes, and a user 
program may catch and handle them. Obviously, exceptions are widely used in 
Java programs. 

Exceptions pose a new challenge to developers of data flow analyses. An
exception raised in a method body induces a control flow other than the main
control flow from the method call. So, at the end of the method body, a proper
analysis must separate the data flow calculated for the raised exception from
the main data flow. At present, a typical data flow analyser either ignores exceptions
or, in the best case, roughly mixes the data flow for the raised exception with the
main data flow. The only known approach to the proper analysis of programs with
exceptions is described in [1].

Data flow analysis implemented in the static error checker OSA
(Oberon-2/Modula-2 Static Analyser) [2] ignores exceptions in Modula-2
programs. Of course, OSA analysis is not correct for exceptions. Nevertheless, this
has caused almost no problems so far because exceptions are rarely used in real
Modula-2 programs. For the OSA analysis of Java programs to be correct, proper
analysis of the exception handling is needed. There is an additional
motivation. In Java programs, catch clauses are rarely executed and therefore are
* This research was supported by the Russian Foundation for Basic Research, RFFI 
97-01-00724 
difficult to test. So for non-trivial exception handling, it is highly probable
that our static analyser will find errors induced by exception handling.

In our data flow analysis, the implicit control flow for a raised exception 
is represented explicitly. Exception branches, exception plateaus, and exception 
exits for methods and method calls are introduced as additional control flow 
structures for analysis of exception handling. These structures are constructed 
dynamically under control of data flow analysis. The hypergraph representation, 
previously applied for statements of Oberon-2/Modula-2 programs, is used for 
method bodies and method calls of analysed Java programs. 

The rest of this paper is organized as follows. First, we present an overview
of the basic data flow analysis applied in the static analyser OSA. In the second
section we describe the Java subset implemented. Our analysis of exception
handling is described in the next two sections. In the third section we introduce the
new notions and new structures applied in the analysis. Next, we present the
analysis of all exception-related Java constructs. In the fifth section we outline
other approaches to the analysis of exception handling.
In the conclusion, we give some remarks on the implementation.

1 Data Flow Analysis Overview 

The static error checker OSA (http://www.xds.ru/osa/) checks programs for 
run-time errors by analysing the source code. The powerful data flow analy- 
sis used in OSA is able to detect various kinds of Modula-2 and Oberon-2 
dynamic semantics violations, which are usually found during debugging and 
testing stages of program development. 

All source code checkers known to us (e.g. for the C/C++ languages) that
detect run-time errors may produce only long lists of warnings due to the weakness of
the analysis they perform. In order for a source code checker to be useful in practice,
it must be able to recognize definite errors in really complicated erroneous
situations. It was shown [2] that at least context-sensitive data flow analysis
with approximation of definite def-use relations must be done in such a static
error checker.

OSA includes the following analyses: 

- context-sensitive and context-insensitive data flow analyses;

- approximation of the definite def-use relations along with the possible ones;

- calculation of variable values: points-to must- and may-aliasing analyses for
reference variables and propagation of value ranges for variables of scalar
types;

- calculation of branch reachability for conditional statements;

- refinement of variable definitions through conditions for branches of
conditional statements;

- approximation of previous instances of heap variables and local variables of
recursive procedures.
OSA analysis is structured as a sequence of the following analysis phases:



— context-insensitive analysis phase; 

— context-sensitive analysis phase; 

— variable value calculation phase; 

— backward analysis of unused values of variables (*); 

— error analysis phase. 

Except for the fourth phase (*), all analyses are forward. Data flow analysis is 
implemented as abstract interpretation of a program [5]. Data flow representa- 
tion is based on the SSA form [3]. At every program point, a Def context is 
calculated as a result of the interpretation. A Def context is the set of variable 
definitions that are valid at the program point. The Def context produced by 
the first phase for the entry of each method is used as upper approximation in 
the second analysis phase. 

In data flow analysis, the control flow is represented by a structure different 
from the traditional control flow graph. A program statement is a hyperleg that 
is a construct with one entry and possibly more than one exit. For example, a 
loop body with break and return statements is represented as a hyperleg with 
at least three exits: 



— for normal loop body end; 

— via break statement; 

— via return statement. 

The whole program is represented as a hierarchical hypergraph [4]. The control 
structures of statements of the program source code are preserved in the 
hypergraph representation. 



2 Java Subset Implemented 

Some Java language features are impossible or highly inefficient to implement in 
data flow analysis. In the current implementation, finalize methods are ignored. 
The order of static class initialization in the OSA analysis may differ from 
that defined by the Java semantics. OSA cannot analyse programs with 
classes that are loaded dynamically during execution. Threads are processed in 
the OSA analysis as sequential programs; proper implementation of thread 
analysis is now under development. Runtime exceptions (null dereference, division 
by zero, etc.) are handled by the OSA analysis only if they are recognized as 
definite; possible exceptions are ignored because handling them would be 
inefficient and, as a rule, useless. The Java subset currently implemented in 
OSA is almost the same as in [1]. 

3 Structures for Exception Handling Analysis 

Unlike the approach of [1], the implicit control flow for raised exceptions is 
represented explicitly in our data flow analysis. The difficulty is that the additional 
control flow structures have to be constructed dynamically in the process of data 
flow analysis. 

For each throw statement, an exception branch is introduced. An exception 
branch has a label and a Def context at the point of the throw statement. 
A label is the set of types of exceptions that are raised by this exception branch. 
When a throw statement is interpreted in data flow analysis, the exception 
branch associated with that throw statement is attached to the current exception 
plateau. A plateau is the place where exception branches are collected for 
further processing. There are three kinds of exception plateaus: 

— catch plateau, inserted after the try block and before the first catch clause; 

— finally plateau, inserted before the finally block; 

— end method plateau, placed at the end of the method body. 

An exception branch that reaches the end of the method body is placed into 
the end method plateau. When interpretation of the method body has completed, 
that exception branch is connected with an additional exception exit of the 
method. A method body is thus represented in data flow analysis by the following 
hyperleg: 




Fig. 1. Hyperleg for a method body. Here exit_1, ..., exit_k are exception exits 

An exception exit has a label (the same as for exception branch) and a Def 
context. 

A method call is represented by a hyperleg of the same structure as for a 
method body. Each call exit begins some branch of a program. An exception 
exit of a method call begins an exception branch whose label is the same as 
that of the exit. At the end of interpretation of a method call, the exception branch 
associated with an exception exit of the call is attached to the current exception 
plateau. 

The new structures and their interrelation in the analysis of exception handling 
are shown for the example program in Fig. 2. 



[Figure: annotated Java source of a class test with static fields s and y, a 
method g() declared with throws myExc that conditionally throws a new myExc, 
and a method main whose try statement has a catch (myExc e) clause that 
conditionally throws a new Error and a finally block that assigns s = 1. The 
catch plateau, the finally plateau, and the end method plateaus of g and main 
are marked in the listing.] 



Fig. 2. Processing of exception flows for the program example 



4 Implementation of Exception Handling 

When interpretation of a try block has completed, the catch plateau of the 
try statement is interpreted. For each exception branch included in the plateau, 
the branch label is matched against the parameters of the catch clauses, according 
to the Java language semantics. As a result, the exception branch is attached either 
to some catch clause or to the plateau of the innermost enclosing construct. If 
more than one exception branch is attached to some catch clause, a merge 
statement for the entering Def contexts is inserted before the catch 
clause. If the branch label partially matches the parameter of a catch 
clause, the branch label is split, and a new exception branch is created with 
the part of the label that did not match the catch clause parameter. That 
exception branch is then matched against the remaining catch clauses. 
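
The matching and splitting step can be illustrated with a minimal sketch. It assumes that a label is a finite set of concrete exception class names and that subtyping is given by a small hypothetical table; the names SUPERTYPE, is_subtype and match_branch are illustrative and are not taken from OSA. 

```python
# Hypothetical class hierarchy: each exception type maps to its direct supertype.
SUPERTYPE = {
    "Throwable": None,
    "Exception": "Throwable",
    "Error": "Throwable",
    "MyExc": "Exception",
    "OtherExc": "Exception",
}


def is_subtype(exc_type, ancestor):
    """Return True if exc_type equals ancestor or inherits from it."""
    while exc_type is not None:
        if exc_type == ancestor:
            return True
        exc_type = SUPERTYPE[exc_type]
    return False


def match_branch(label, catch_params):
    """Distribute the types of one branch label over the catch clauses.

    Clauses are tried in textual order. per_clause[i] is the part of the
    label attached to clause i; rest is the unmatched part, which would be
    passed on to the plateau of the enclosing construct.
    """
    remaining = set(label)
    per_clause = []
    for param in catch_params:
        caught = {t for t in remaining if is_subtype(t, param)}
        per_clause.append(caught)
        remaining -= caught  # the label is split; the rest goes on
    return per_clause, remaining
```

With the label {MyExc, Error} and a single clause catch (MyExc e), the sketch attaches {MyExc} to the clause and leaves {Error} for the enclosing plateau, which mirrors the split described above. 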

In the inner program representation, a finally block is represented as an 
independent procedure whose calls are inserted into all appropriate places of 



394 



Vladimir I. Shelekhov and Sergey V. Kuksenko 



the try statement. This decision guarantees that different data flows in the 
try statement are never mixed. For each exception branch of the finally 
plateau, a call of the finally block is dynamically inserted; in accordance with the 
Java language semantics, the main exit of this call is labelled with the exception 
branch label. 

When interpretation of a method body has completed, the end method 
plateau is interpreted. For all exception branches of the end method plateau, 
method exits are dynamically constructed so that the following conditions are 
met: 

- the label of each exception branch is a subset of the union of labels of method 
exits; 

- for each exception branch and for each method exit, either the exit label is a 
subset of the branch label, or branch and exit labels do not intersect. 

These conditions guarantee that data flows of two exception branches with 
different labels are never mixed. If several exception branches are connected 
with one method exit, a merge statement for the entering Def contexts is 
inserted to produce the target Def context for the exit. 
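
One way to obtain exit labels that meet both conditions is sketched below. It is an illustration only, under the assumption that labels are finite sets of type names: every type is grouped by the set of branches whose label contains it, and each group becomes one exit label. The function names are hypothetical, and the text does not state that OSA constructs exits in this way. 

```python
from collections import defaultdict


def build_exit_labels(branch_labels):
    """Build exit labels from the labels of the exception branches.

    Two exception types end up in the same exit label exactly when they
    occur in the same set of branch labels.
    """
    signature = defaultdict(set)
    for index, label in enumerate(branch_labels):
        for exc_type in label:
            signature[exc_type].add(index)
    exits = defaultdict(set)
    for exc_type, branches in signature.items():
        exits[frozenset(branches)].add(exc_type)
    return [frozenset(types) for types in exits.values()]


def satisfies_conditions(branch_labels, exit_labels):
    """Check the two conditions stated for method exits."""
    union = set().union(*exit_labels) if exit_labels else set()
    covered = all(set(b) <= union for b in branch_labels)
    separated = all(
        e <= set(b) or not (e & set(b))
        for b in branch_labels
        for e in exit_labels
    )
    return covered and separated
```

For the branch labels {A, B} and {B, C}, the sketch yields the exit labels {A}, {B} and {C}; the exit labelled {B} is shared by both branches, so a merge of their Def contexts would be needed there. 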

For a method call, data flow analysis calculates all methods which may be 
invoked by this call. For each invoked method and for each exception exit of this 
method, the call must include an exit with the same label as the method exit. If 
it does not, a new exit with the needed label is created for the call. 

5 Related Work 

The only known approach to data flow analysis that properly handles 
exceptions is described in [1]. In that article, data flow information (the conditional 
points-tos) may be additionally labelled with exceptions. This is a natural but 
nontrivial extension of the context-sensitive Landi-Ryder pointer aliasing 
algorithm [6]. 

The problem of fast static calculation of possible uncaught exceptions in 
SML programs was solved in [8]. A program call graph and exception flows are 
estimated from sets of equations and constraints. As for the Java language, a Java 
compiler must guarantee that each raised checked exception will be caught. For 
unchecked exceptions, the analyser OSA produces a warning message for each 
uncaught exception. 

6 Conclusion 

Unlike the approach of [1], the implicit control flow for raised exceptions is 
represented explicitly in our data flow analysis. Method bodies and method calls 
are represented as constructs with one entry and possibly more than one exit. In 
previous works, control flow structures of the analysed program are constructed 
before data flow analysis. In our approach, new control flow structures for 
exception flows are constructed dynamically under control of data flow analysis. 
Our implementation of exception handling by means of an extension of 
the control flow mechanism appears to be considerably less complicated than 
the one described in [1]. 

Data flow analysis in the presence of exceptions was implemented in the 
static analyser OSA for the Java language. A beta version of the analyser OSA is 
available at http://www.xds.ru/osa/. In the process of OSA debugging, many 
real Java programs were passed through OSA. The number of different exits in 
a method body was always less than six. According to our estimates, the 
memory and processor time required for exception exits do not exceed 20% of the 
memory and time required for the whole analysis. As for errors found by the analyser 
OSA in Java programs, nontrivial catch clauses proved to be the most error 
prone. 

The authors are grateful to Dmitry Leskov of XDS for many useful critical notes 
concerning this paper. 

Abstract. In distributed object systems, one can make 
method invocations on objects located on other hosts. During such an 
invocation, data is sent to another host and back. However, the system 
tries to hide this and to simulate a standard method invocation as closely as 
possible. Some systems [Voyager] try to offer other invocation semantics, 
e.g. asynchronous method invocation. 

We try to go a step further and offer the actual invocation as a first-class 
abstraction. The programmer can build his own abstractions by either 
implementing his own or by combining existing abstractions. With this 
system, he can build arbitrary invocation semantics, e.g. synchronous 
method invocation with transactional semantics, which also logs all 
method invocations. 



1 Overview 

Today’s highly interconnected systems put more and more emphasis on the ex- 
ploitation of the advantages inherent to a network, i.e. increased fault tolerance, 
better availability, and easier scalability. However, network systems have their 
disadvantages as well, and it is not easy to actually exploit their advantages. 
Independent failure modes, which have to be handled when dealing with several 
computers, increase the complexity of software development. Additionally, net- 
worked systems are often heterogeneous and highly dynamic. The configuration 
of available computation resources may change at a moment's notice. To cope 
with these problems different approaches have been proposed. A common ap- 
proach is to put part of the additional complexity into the object system, i.e., 
to hide it from the developer, by extending the notion of objects and classes. 

However, distributed object systems, e.g. Object Management Group's OMG 
[OMG], Microsoft's DCOM [Micro], or JavaSoft's Remote Method Invocation 
[RMI98], use a fixed scheme: a point-to-point request/response communication 
model. While appropriate for a subset of applications using distributed objects, 
this model inhibits exploiting the advantages of distributed objects for other 
domains. Other work has been done to widen the application domain for 
distributed objects by introducing new kinds of method invocation semantics, e.g. 
Voyager [Voyager], which introduces asynchronous method invocation. 

This paper describes a novel approach to widen the application domain for 
distributed objects even further. We claim that introducing new special cases, 
as done e.g. in Voyager, is not sufficient. There is an infinite number of possible 
desirable kinds of method invocation, e.g. asynchronous vs. synchronous, unicast 
vs. multicast, replicated, transactional, logged, or atomic. We claim that, just as 
any other aspect of a distributed system, the "invocation style" should be a 
first-class abstraction. One should be able to compose abstractions and use the most 
adequate ones according to the application needs. 

1.1 Distributed Object Methodology 

A client sees an object as a reference into memory, some data fields, and a set of 
type bound procedures (methods). An application does not have to distinguish 
between local and remote objects. 

The application has transparent access to all objects regardless of their actual 
location. For every accessed remote object, the system automatically generates 
a so-called stub object. A stub object is the local representative (placeholder) 
of an object located on another site. It offers exactly the same interface as its 
associated actual object, but redirects incoming requests to the actual object. 
The request (object ID, invoked method, and actual parameters) is transformed 
(marshalled) to a byte stream, which is sent from the stub to the skeleton. This 
stream includes all information needed to reconstruct the receiver object, the 
called method, and the actual parameters. This mechanism is similar to the 
RPC mechanism [BiNe84, Tan95], except that a receiver object is passed along 
with each new invocation. 

1.2 Code Generation for Stub and Skeleton 

Stub and skeleton code is generated automatically from the interface definition 
of the given class. Typical stub and skeleton code consists of three parts: 

1. Marshalling of all input parameters. 

2. Activation of the transport mechanism in order to signal the intercepted 
method invocation to the actual object. 

3. Unmarshalling of output parameters and the return value. 

Logically, we introduce one additional layer (see Fig. 1). A method 
invocation, which is not handled locally, is intercepted by its corresponding stub 
method. Each stub method is tailored to its method and is mainly concerned 
with marshalling. After the marshalling is done, the stub code calls, regardless 
of the invocation mode, the global invocation handler. This handler chooses 
and activates the previously assigned invocation mode. The structure of the 
invocation modes is explained in the next section. 

In our current prototype implementation, we achieve this behaviour without 
introducing a new layer by using an array of invocation abstractions. Each stub 
knows the method to be used and chooses the correct abstraction through an index 
into an invocation array. With the help of this mechanism we avoid the additional 
layer and achieve a faster dispatch. 

client:            obj.M1stub(): 
                     marshal parameters 
  obj.M1()  ---->    HandleInvoke() 
                     unmarshal parameters 

HandleInvoke(): 
  invoke remote method with 
  appropriate invocation mode 



Fig. 1. Control flow on method invocation 
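
The control flow of Fig. 1, with dispatch through an array of invocation abstractions, can be sketched as follows. The sketch is a local simulation only: the "remote" object lives in the same process, pickle stands in for the marshalling code, and the names Counter, CounterStub and SyncInvocation are hypothetical and do not come from the described system. 

```python
import pickle


class Counter:
    """Stands in for the actual object on the server side."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value

    # Method table indexed by method id (an assumption of this sketch).
    METHODS = [add]


class SyncInvocation:
    """Hypothetical invocation mode: call the target, return the marshalled result."""

    def invoke(self, obj, method_id, stream):
        args = pickle.loads(stream)
        result = obj.METHODS[method_id](obj, *args)
        return pickle.dumps(result)


class CounterStub:
    """Local placeholder offering the same interface as Counter."""

    def __init__(self, target, invocations):
        self._target = target            # stands in for the remote reference
        self._invocations = invocations  # one invocation mode per method index

    def add(self, n):
        stream = pickle.dumps((n,))                  # marshal parameters
        mode = self._invocations[0]                  # index into the invocation array
        reply = mode.invoke(self._target, 0, stream)
        return pickle.loads(reply)                   # unmarshal the return value
```

A client would then call CounterStub(Counter(), [SyncInvocation()]).add(2) exactly as it would call the method on a local Counter. 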

2 Generating Invocation Modes 

We offer the programmer a class hierarchy of invocation abstractions. Invocation 
is the abstract base class. Whenever an object is to be exported (made public 
to other hosts), the programmer must specify the desired invocation modes for 
every method of the object he exports. One can have an individual invocation 
configuration for each object of a class or reuse a configuration for all objects of 
the same class. 

To export an object one has to call the procedure Export that is part of a 
library. One has to specify the host on which the given object is exported, the 
name of the object and the desired invocation abstractions. As a result of this 
operation, the system generates ’on-the-fly’ the necessary skeleton code to access 
this object (see example below). 

invoke := Invocations.GetClassInfo("className"); 

// ... modify abstractions to current needs 
Export(object1, host, name1, invoke); 

// ... modify abstractions if necessary 
Export(object2, host, name2, invoke); 

When a client imports an object, it calls the procedure Import. One has to 
specify the name of the object and the host where it resides. An appropriate request 
is sent to the server host. The server host sends back two kinds of information: 
first, the invocation abstractions of the exported object, and second, the actual 
object data. With the help of the received invocation abstractions the necessary stub 
code is generated and the actual object is generated. 

Import(obj, host, name); 



The necessary invocation information is generated with a call to GetClassInfo, 
which uses meta-programming facilities to collect it. It returns the default 
invocation information for a class. If an object is exported with this information, 
one gets the following default behaviour: 

1. Methods are called synchronously with the standard semantics of method calls. 

2. Parameters of a pointer type are copied using deep-copy semantics. 

3. If a method returns a pointer value, the referenced object is copied to the 
caller. 

These three standard behaviours can be changed as described in the following 
three sections. 



2.1 Changing the Invocation Mode 

By default, all method invocations are handled as standard method invocations, 
i.e. synchronously. However, one can change the behaviour as needed by composing 
one's own invocation abstractions. One can either create a new abstraction 
that suits the current needs or compose one from existing 
abstractions using the decorator pattern [Gam95]. Let us look at some examples: 

1. An asynchronous invocation abstraction, which uses replication and logs the 
invocations: 



VAR 

myinvoke: Invocation; 

myinvoke := LogMode(ReplicationMode(ASyncInvocation())); 

2. A synchronous invocation with transaction semantics: 

VAR 

myinvoke: Invocation; 

myinvoke := TransactionMode(SyncInvocation()); 

3. After generation of the desired invocation abstraction one can assign it to 
the desired method(s) and assign it to an exported object: 

invoke.Method("name of method", myinvoke); 

Export(obj, host, name, invoke); 



To implement one's own transaction invocation, one has to create 
a new subclass of the class Invocation and override the method 

PROCEDURE (i: Invocation) Invoke(obj: PTR; id: LONGINT; s: Stream): Stream; 



This method is called whenever a method that uses this invocation 
abstraction is activated. obj is the invoked object, id contains a unique number 
identifying the called method, and s contains the marshalled parameters. The 
method has to return the linearized return value and output parameters. 
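
The composition of invocation abstractions through the decorator pattern can be sketched as follows. The class names follow the text, but the bodies are assumptions made for illustration: the shared trace list, the dispatch method of the target object, and the begin/commit/abort markers are hypothetical and only show the order in which the decorated modes run. 

```python
class Invocation:
    """Abstract base class of all invocation abstractions."""

    def invoke(self, obj, method_id, stream):
        raise NotImplementedError


class SyncInvocation(Invocation):
    """Innermost mode: performs the call and waits for the reply."""

    def invoke(self, obj, method_id, stream):
        return obj.dispatch(method_id, stream)


class LogMode(Invocation):
    """Decorator that records every invocation before delegating."""

    def __init__(self, inner, trace):
        self.inner = inner
        self.trace = trace

    def invoke(self, obj, method_id, stream):
        self.trace.append(f"log {method_id}")
        return self.inner.invoke(obj, method_id, stream)


class TransactionMode(Invocation):
    """Decorator that brackets the delegated call with begin/commit."""

    def __init__(self, inner, trace):
        self.inner = inner
        self.trace = trace

    def invoke(self, obj, method_id, stream):
        self.trace.append("begin")
        try:
            reply = self.inner.invoke(obj, method_id, stream)
        except Exception:
            self.trace.append("abort")
            raise
        self.trace.append("commit")
        return reply


class Echo:
    """Hypothetical target object: returns the marshalled parameters unchanged."""

    def dispatch(self, method_id, stream):
        return stream
```

Each decorator does its own work and then delegates to the mode it wraps, so a new semantics is added by writing one more subclass rather than by changing the stubs. 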

2.2 Changing the Copy-Mode for Individual Parameters 

Methods can have pointer parameters. This implies, if the method is executed 
remotely, that the referenced objects have to be transferred from the client to 
the server and back. Either one can make a deep copy, actually generating a 
copy of the referenced object on the other host, or one can make a shallow 
copy. A pointer parameter copied in shallow copy mode is not transferred to the 
server. Instead, the object is automatically exported with an anonymous name. 
On the server side, before the server method is invoked, a corresponding import 
statement is executed automatically (see Fig. 2). 




Fig. 2. Shallow copy of parameter 



2.3 Changing the Copy Mode of the Return Value 

Methods can return pointer values. This implies, if the method is executed re- 
motely, that the referenced object has to be transferred from the server to the 
client host. As with pointer parameters (Section 2.2), one has the possibility to 
make a deep or a shallow copy, i.e. the procedure returns either the actual object 
(deep copy) or another stub object (shallow copy). 

3 Conclusions 

A method invocation that is not handled locally is intercepted by its corre- 
sponding stub method. Each stub method is tailored to its method. The stub 
is concerned mainly with marshalling. The actual invocation is delegated to the 
procedure HandleInvocation (see Fig. 1). When called, HandleInvocation decides 
on the invocation mode actually used: 

HandleInvocation (rec: PTR; id: LONGINT; data: Stream) : Stream; 
info := ... Invocation information for the object rec 
invoke := invoke mode for method id in info 
data := invoke.Invoke(rec, id, data) 

RETURN data 
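
The lookup performed by this procedure can be sketched as follows. The registry EXPORTED, the helper export, and the representation of invocation modes as plain functions are assumptions of this sketch; data stands in for the marshalled stream. 

```python
# Hypothetical registry filled at export time:
# object identity -> {method id -> invocation mode}.
EXPORTED = {}


def export(obj, modes):
    """Record the invocation mode chosen for each method of obj."""
    EXPORTED[id(obj)] = dict(modes)


def handle_invocation(rec, method_id, data):
    """Select the mode assigned to this object and method, then delegate."""
    info = EXPORTED[id(rec)]     # invocation information for the object rec
    invoke = info[method_id]     # invoke mode for method id in info
    return invoke(rec, method_id, data)


def sync_mode(rec, method_id, data):
    """Plain synchronous call on the receiver."""
    return rec.dispatch(method_id, data)


def logged(inner, log):
    """Wrap a mode so that each invoked method id is recorded first."""
    def mode(rec, method_id, data):
        log.append(method_id)
        return inner(rec, method_id, data)
    return mode


class Account:
    """Hypothetical exported object with two methods: 0 = deposit, 1 = read."""

    def __init__(self):
        self.balance = 0

    def dispatch(self, method_id, data):
        if method_id == 0:
            self.balance += data
        return self.balance
```

Because the table is filled when the object is exported, two objects of the same class may use different modes for the same method. 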

An actual implementation of Invoke will do some invocation specific state- 
ments (open/close transaction...) and delegate the invocation to the decorated 
invocation mode, e.g. for the invocation mode resulting from the statement 

invoke := LogMode(TransactionMode(SyncInvocation())); 
the actual sequence of invocation modes is as shown in Fig. 3. 




Fig. 3. Example abstraction sequence 



The invocation mode is defined only at runtime when an object is exported. 
Each time one exports an object, one can choose other invocation abstractions. 

Abstract. We propose a strategy language for designing single constraint 
solvers as well as their collaborations. Based on the notions of 
constraint filter, separator, and sorter, we define basic strategy opera- 
tors that allow us to specify single solvers and their collaboration in a 
uniform way. We exemplify the use of this language by specifying some 
techniques for solving non-linear constraints over real numbers and CSPs 
over finite domains. 



1 Introduction 

In the last twenty years, a lot of work has been done on solving Constraint Sat- 
isfaction Problems (CSPs) [8]. The existing constraint solvers have been success- 
fully applied for solving real-life problems. We could say that constraint solving 
over a particular domain is well-understood. In the case of solvers based on prop- 
agation, either the control is left at the implementation level, or the strategy is 
fixed. For example, to be completely formal when adding strategies to Chaotic 
Iteration, we must prove that the algorithm and the strategy really compute 
the same fixed-point as Chaotic Iteration alone [1]. Arc-consistency algorithms, 
originally developed just for binary constraints, use fixed strategies and fixed 
data structures, thus, it is not possible to change the strategy [10,4]. Solvers 
based on other techniques, such as Gröbner bases and the simplex method, use 
a dedicated strategy. Finally, the deductive approach used in COLETTE allows 
a fine control of the computation, the strategy being a parameter, but there is 
no solver collaboration and some features are hidden in the implementation lan- 
guage (such as associative-commutative properties and term manipulation) [6]. 



Given that the development of constraint solvers is, in general, an expen- 
sive and tedious task, the interest for reusing existing solvers is obvious [17]. 
Even more important, when dealing with problems that cannot be tackled or 
efficiently solved with a single solver, we definitively realize the interest of inte- 
grating several solvers, working, in general, over different domains [15,3,9,16,14]. 
This is called Collaboration of Solvers [11]. In order to make solvers collaborate, 
the need for powerful strategy languages to control their integration has been 
well recognized [12,13,2]. The existing approaches consider a fixed domain (linear 
constraints [3], non-linear constraints over real numbers [14,9,7]), a fixed 
strategy, and a fixed scheme of collaboration (sequential [14,7], asynchronous [9]). In 
the language Bali, the collaboration is specified using control primitives. The 
constraint system is a parameter, but the control capabilities for specifying strategies 
are not fine enough [13]. 

In this paper, we propose a control language for specifying single constraint 
solvers and their collaborations. Based on [5], a solver is viewed as a strategy that 
specifies the order of application of elementary operations expressed by transfor- 
mation rules. In this framework, different domains mainly mean the definition of 
different transformation rules, and different heuristics mean different strategies. 
Extending this idea, we consider the collaboration of solvers as a strategy that 
specifies the order of application of single, or component solvers. 

Our main motivation is to provide a general framework for defining single 
constraint solvers in a formalism that allows us to specify high-level operations on 
constraints as well as syntactical transformations normally hidden in the current 
implementations of constraint solvers. Our interest is to define this framework 
in a way that allows its natural extension for specifying the collaboration of 
solvers, since the design of constraint solvers and the design of collaboration of 
solvers require similar methods (strategies are often the same: don’t-care, fixed 
point, iteration, parallel, concurrent, ...). In other words, we propose a language 
for writing single solvers and collaboration of solvers at the same level, mak- 
ing explicit things that are generally hidden in the implementations: strategies, 
properties of the operators (such as AC property), term manipulation, ... 

We have already used our control language to design several solvers with 
several strategies: a Simplex algorithm, Gröbner bases computation, and some 
propagation based solvers for finite domains and real numbers. However, for lack 
of space, we only present in this paper examples over two domains: some solvers 
for constraints over real numbers, and some solvers for finite domains. 

This paper is organized as follows: Section 2 presents some standard defini- 
tions. In Section 3, we introduce the basic components of our control language 
which is later presented in Section 4, and illustrated in Section 5 with the design 
of solvers for constraints over real numbers and over finite domains. In Section 6, 
we conclude the paper. 

2 Definitions 

Definition 1 (Constraint System). 

A constraint system is a 4-tuple $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ where: 

- $\Sigma$ is a first-order signature given by a set of function symbols $\mathcal{F}_\Sigma$ and a set 
of predicate symbols $\mathcal{P}_\Sigma$, 

- $\mathcal{D}$ is a $\Sigma$-structure (its domain being denoted by $|\mathcal{D}|$), 

- $\mathcal{V}$ is an infinite denumerable set of variables, and 




404 Carlos Castro and Eric Monfroy 



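- $\mathcal{L}$ is a set of constraints: a non-empty set of $(\Sigma, \mathcal{V})$-atomic formulae, called 
atomic constraints, closed under conjunction and disjunction. The unsatisfiable 
constraint is denoted by $\bot$ and the true constraint is denoted by $\top$. The 
set of atomic constraints is denoted by $\mathcal{L}_{At}$. 

An assignment is a mapping $\alpha : \mathcal{V} \to |\mathcal{D}|$. The set of all assignments is 
denoted by $ASS_{\mathcal{D}}^{\mathcal{V}}$. An assignment $\alpha$ extends uniquely to a homomorphism 
$\underline{\alpha} : T(\Sigma, \mathcal{V}) \to |\mathcal{D}|$. The set of solutions of a constraint $c \in \mathcal{L}$ is the set $Sol_{\mathcal{D}}(c)$ 
of assignments $\alpha \in ASS_{\mathcal{D}}^{\mathcal{V}}$ such that $\underline{\alpha}(c)$ holds. A constraint $c$ is valid in $\mathcal{D}$ 
(denoted by $\mathcal{D} \models c$) if $Sol_{\mathcal{D}}(c) = ASS_{\mathcal{D}}^{\mathcal{V}}$. We denote by $Var(c)$ the set of variables 
from $\mathcal{V}$ occurring in the constraint $c$. Finally, we introduce the notion of solver. 

Definition 2 (Solver). A solver for a constraint system $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ is a 
computable function $S : \mathcal{L} \to \mathcal{L}$ such that 

1. $\forall C \in \mathcal{L},\ Sol_{\mathcal{D}}(S(C)) \subseteq Sol_{\mathcal{D}}(C)$ (correctness property) 

2. $\forall C \in \mathcal{L},\ Sol_{\mathcal{D}}(C) \subseteq Sol_{\mathcal{D}}(S(C))$ (completeness property) 

A constraint $C$ is in solved form with respect to $S$ if $S(C) = C$. 

Given a constraint system $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ and a solver $S$ over $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L}')$ such 
that $\mathcal{L}' \subseteq \mathcal{L}$, we extend $S$ to $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ in the following way: $\forall C \in \mathcal{L} \setminus \mathcal{L}'$, 
$S(C) = C$. 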
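
The two properties of Definition 2 can be checked by brute force on a small finite domain. The sketch below is an illustration only: constraints are represented as Python predicates over assignments, which is an assumption of the sketch and not the representation used in the paper, and the names solutions and preserves_solutions are hypothetical. 

```python
from itertools import product


def solutions(constraint, variables, domain):
    """Enumerate all assignments over the domain that satisfy the constraint."""
    found = set()
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if constraint(assignment):
            found.add(tuple(sorted(assignment.items())))
    return found


def preserves_solutions(solver, constraint, variables, domain):
    """Check correctness and completeness of solver on one constraint."""
    before = solutions(constraint, variables, domain)
    after = solutions(solver(constraint), variables, domain)
    correct = after <= before    # Sol(S(C)) is a subset of Sol(C)
    complete = before <= after   # Sol(C) is a subset of Sol(S(C))
    return correct and complete


def original(a):
    """C: x + y = 4 and x < y."""
    return a["x"] + a["y"] == 4 and a["x"] < a["y"]


def good_solver(constraint):
    """Returns an explicit form with the same solutions over 0..4."""
    return lambda a: (a["x"], a["y"]) in {(0, 4), (1, 3)}


def lossy_solver(constraint):
    """Correct but not complete: the solution (0, 4) is lost."""
    return lambda a: (a["x"], a["y"]) == (1, 3)
```

Over the domain 0..4, good_solver passes the check, whereas lossy_solver fails it because it satisfies only the correctness property. 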



3 Filters and Sorters 

We now define the basic components of our strategy language: filters to select 
specific parts of a constraint, and sorters to classify the elements of a list w.r.t. 
a given ordering. 

We introduce the notion of filter for two main reasons. First, a solver can, in 
general, be tried on several parts of a constraint [5]. Second, when dealing with 
solver collaborations, in general, a single solver is not able to treat the complete 
constraint [12]. In both cases, we want to identify the sub-parts of the constraint 
that the solver is actually able to handle. Once we have identified these parts, 
we generally want to choose some of them based on a given criterion^1. Thus, we 
introduce the notion of sorter, which is associated with a notion of strategy. 

We consider that the equality $=$ is purely syntactic. Thus, we say that $C'$ is 
a syntactical form of $C$, denoted by $C' \approx C$, if $C' = C$ modulo the associativity 
and commutativity of $\wedge$ and $\vee$, and the distributivity of $\wedge$ on $\vee$ and of $\vee$ on 
$\wedge$. In other words, a filter returns an equivalent constraint when we block the 
associative, commutative, and distributive properties of the operators. 

We denote by $SF(C)$ the finite set of all the syntactical forms of a constraint 
$C$: $SF(C) = \{C' \mid C' \approx C\}$^2. We say that $C' \in \mathcal{L}$ is a sub-constraint of $C$, denoted 
by $C[C']$, if: 

^1 Minimum Domain criterion, for example, when dealing with finite domains. 

^2 The ACD theory defines a finite set of quotient classes that we can effectively filter. 




A Control Language for Designing Constraint Solvers 405 



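- $\exists C_1, C_2 \in \mathcal{L},\ \omega_1, \omega_2 \in \{\wedge, \vee\},\ C = C_1\, \omega_1\, C'\, \omega_2\, C_2$ 

- or $\exists C_1 \in \mathcal{L},\ \omega \in \{\wedge, \vee\},\ C = C_1\, \omega\, C'$ 

- or $\exists C_1 \in \mathcal{L},\ \omega \in \{\wedge, \vee\},\ C = C'\, \omega\, C_1$ 

- or $C = C'$ 



A couple $(C'', C')$ such that $C''$ is a sub-constraint of $C'$ and $C' \approx C$ is called 
an applicant of $C$. We denote by $\mathcal{LA}$ the set of all the lists of applicants, and by 
$\mathcal{LC}$ the set of all the lists of constraints. Generally, we will use $LA$ to denote a 
list of applicants, and $LC$ to denote a list of constraints. We denote by $\mathcal{P}(\mathcal{L} \times \mathcal{L})$ 
the power-set of all the sets of couples of constraints. Finally, $Atom(C)$ denotes 
the set of atomic constraints that occur in $C$: $\{c \mid c \in \mathcal{L}_{At} \text{ and } C[c]\}$. 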

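Definition 3 (Filter). Let $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ be a constraint system. Then, a filter 
$\phi$ on $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ is a computable function $\phi : \mathcal{L} \to \mathcal{P}(\mathcal{L} \times \mathcal{L})$ such that: 

$\forall C \in \mathcal{L},\ \phi(C) = \{(Cf_1, C_1), \ldots, (Cf_n, C_n)\}$ 



where: 

- $\forall i \in [1, n],\ C \approx C_i$ ($C_i$ is a syntactical form of $C$), 

- $\forall i \in [1, n],\ C_i[Cf_i]$ ($Cf_i$ is a sub-constraint of $C_i$). 

The elements of $\phi(C)$ are called candidates. Given the filters $\phi$ and $\phi'$ on 
$(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$, we say that: 

- $\phi$ is selective if $\forall C \in \mathcal{L},\ \phi(C) = \{(Cf_1, C_1), \ldots, (Cf_n, C_n)\}$ such that $\forall (i, j) \in 
[1, \ldots, n] \times [1, \ldots, n],\ i \neq j,\ Atom(Cf_i) \cap Atom(Cf_j) = \emptyset$. 

- $\phi$ is stable if $\forall C \in \mathcal{L},\ \phi(C) = \{(Cf_1, C'), \ldots, (Cf_n, C')\}$. 

- $\phi$ and $\phi'$ are disjoint if $\forall C \in \mathcal{L},\ \phi(C) = \{(Cf_1, C_1), \ldots, (Cf_n, C_n)\}$ and 
$\phi'(C) = \{(Cf'_1, C'_1), \ldots, (Cf'_m, C'_m)\}$, s.t. $\forall (i, j) \in [1, \ldots, n] \times [1, \ldots, m]$, 
$Atom(Cf_i) \cap Atom(Cf'_j) = \emptyset$. 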



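Property 1. Let $\phi_1$ and $\phi_2$ be two filters on $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$. Then, $\phi_1 ; \phi_2$ defined by 

$\forall C \in \mathcal{L},\ \phi_1 ; \phi_2 (C) = \phi_1(C) \cap \phi_2(C)$ 

is a filter on $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$. 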



Example 1. Consider the constraint system $(\Sigma, \mathcal{D}, \mathcal{V}, \mathcal{L})$ s.t. the predicate $\in$ is in 
$\Sigma$, and that $\mathcal{L}$ contains some domain constraints, i.e. $X \in D_X$, where $D_X$ (the 
domain of $X$) specifies the values that the variable $X$ can take. We now define 
a filter for these domain constraints: 

$\forall C \in \mathcal{L},\ \phi_D(C) = \{(c, C) \mid C[c] \text{ and } \exists X \in \mathcal{V},\ c = (X \in D_X)\}$ 

The filter $\phi_D$ is stable and selective. We denote by $\mathcal{L}_{Dom}$ the elements of $\mathcal{L}_{At}$ 
resulting from the application of this filter. We will re-use this notation in other 
examples. 
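
A filter for domain constraints can be sketched as follows. The sketch makes two simplifying assumptions that are not in the text: a constraint is a flat conjunction stored as a tuple of atoms, and syntactical forms modulo associativity, commutativity and distributivity are ignored, so the only form returned is the constraint itself. The atom encoding and function names are hypothetical. 

```python
def domain_filter(constraint):
    """Return the candidates (c, C) where c is a domain constraint occurring in C.

    An atom is a tuple: ("in", variable, values) encodes a domain constraint,
    any other tag encodes some other atomic constraint.
    """
    return {(atom, constraint) for atom in constraint if atom[0] == "in"}


def is_stable(candidates):
    """Stable: every candidate carries the same syntactical form."""
    return len({form for _, form in candidates}) <= 1


# X in {1, 2, 3} and X < Y and Y in {2, 3}
C = (("in", "X", (1, 2, 3)), ("lt", "X", "Y"), ("in", "Y", (2, 3)))
CANDIDATES = domain_filter(C)
```

Here CANDIDATES contains one candidate per domain constraint, each paired with the unchanged constraint C, so the filter is stable; it is selective because distinct candidates are distinct atoms. 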

Example 2. We now consider patterns of constraints. The utility of this filter 
will be clarified in Section 5. We want to filter sub-constraints that are the 
conjunction of a domain constraint, an atomic constraint, and a conjunction of 
domain constraints, i.e., an atomic constraint and all the domain constraints 
of the variables occurring in it. 

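$\forall C \in \mathcal{L}$, $\phi_{D \wedge c \wedge Ds}(C) \subseteq \mathcal{L} \times \mathcal{L}$ and $\phi_{D \wedge c \wedge Ds}(C)$ is defined as follows:

1. Patterns:

$(C'', C') \in \phi_{D \wedge c \wedge Ds}(C) \Rightarrow C'' = (X \in D_X) \wedge c \wedge \bigwedge_{Y \in Var(c) \setminus \{X\}} (Y \in D_Y)$
$\quad \wedge\ c \in \mathcal{L}_{At} \setminus \mathcal{L}_{Dom}$
$\quad \wedge\ C' \in SF(C)$
$\quad \wedge\ C'[C'']$
$\quad \wedge\ X \in Var(c)$

2. Context-free:

$((C'', C_1) \in \phi_{D \wedge c \wedge Ds}(C) \wedge (C'', C_2) \in \phi_{D \wedge c \wedge Ds}(C)) \Rightarrow C_1 = C_2$

3. Commutative-free:

$((X \in D_X \wedge c \wedge C'', C_1) \in \phi_{D \wedge c \wedge Ds}(C) \wedge (X \in D_X \wedge c \wedge C''', C_2) \in \phi_{D \wedge c \wedge Ds}(C)) \Rightarrow C_1 = C_2$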

Item 1 requires that elements of $\phi_{D \wedge c \wedge Ds}(C)$ have some syntactical properties;
in Item 2, we do not want to consider the same sub-constraints several times
when they are issued from different syntactical forms of $C$; and finally, in Item 3, we specify
that the ordering of the conjunction of domain constraints is not relevant.

Items 2 and 3 are not mandatory, but they reduce the number of applicants.
This definition does not provide uniqueness of the filter. Depending on our needs,
we can consider (1) adding requirements to define one set of applicants per constraint,
(2) removing Items 2 and 3, or (3) selecting one of the sets corresponding
to the definition.

For example, consider the problem of solving CSPs, and let $S$ be a function
(or a rewrite rule) which reduces the domain of one variable using one
constraint. Then, for each constraint of the CSP, and each variable of this constraint,
we can consider a possible application of $S$.



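Definition 4 (Sorter). A sorter Sorter, w.r.t. a partial ordering $\preceq$, for a constraint
system $(\Sigma, \mathcal{D}, V, \mathcal{L})$ is a computable function that maps $\preceq$ and a set of $\mathcal{P}(\mathcal{L} \times \mathcal{L})$ to a list of elements of $\mathcal{L} \times \mathcal{L}$,
such that $\forall \{(C_{f_{i_1}}, C_{i_1}), \ldots, (C_{f_{i_n}}, C_{i_n})\} \in \mathcal{P}(\mathcal{L} \times \mathcal{L})$:

1. $Sorter(\preceq, \{(C_{f_{i_1}}, C_{i_1}), \ldots, (C_{f_{i_n}}, C_{i_n})\}) = [(C_{f_1}, C_1), \ldots, (C_{f_n}, C_n)]$

2. $\forall k \in [1, \ldots, n],\ \exists j \in [1, \ldots, n],\ C_{f_{i_j}} = C_{f_k}$ and $C_{i_j} = C_k$

3. $\forall j \in [1, \ldots, n-1],\ C_{f_j} \preceq C_{f_{j+1}}$



Remark 1. We consider that a sorter is deterministic, i.e., if $L$ is a set of applicants,
each application of Sorter on $L$ will always return the same list of applicants.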
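
As an illustration of Definition 4 and Remark 1, a sorter can be sketched in Python. This is only a sketch under assumptions that are not in the paper: the ordering is induced by a numeric key function `width`, candidates are plain tuples, and ties are broken on `repr` so that the result is deterministic.

```python
from typing import Callable, Hashable, List, Set, Tuple

# A candidate is a pair (sub-constraint C_f, syntactical form C_i).
Candidate = Tuple[Hashable, Hashable]


def make_sorter(width: Callable[[Hashable], float]) -> Callable[[Set[Candidate]], List[Candidate]]:
    """Build a sorter from a key function (assumption: the ordering is
    'smaller width first', applied to the sub-constraint of each candidate)."""

    def sorter(candidates: Set[Candidate]) -> List[Candidate]:
        # Ties are broken on repr() so that the same set always yields the
        # same list, which is the determinism required by Remark 1.
        return sorted(candidates, key=lambda cand: (width(cand[0]), repr(cand)))

    return sorter


# Hypothetical encoding: a domain constraint X in [lo, hi] is (name, lo, hi).
min_dom = make_sorter(lambda dc: dc[2] - dc[1])
applicants = {(("x", 0, 10), "C"), (("y", 0, 2), "C"), (("z", 0, 5), "C")}
ordered = min_dom(applicants)
```

Here `ordered` lists the candidate on `y` first, then `z`, then `x`, and a second call on the same set returns the same list.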
Example 3 (MaxDom and MinDom sorters). $\preceq_{Dom}$ is an ordering based on the
width of the domain of domain constraints. For atomic domain constraints, MaxDom
and MinDom are straightforward, but we may need to consider these orderings
for more complex constraints (e.g., patterns of constraints issued from filters).
MaxDom and MinDom use the width of domains of variables. Let $X \in D_X$
be a domain constraint; we consider the generic function $\omega$ which gives the width
of a domain. We define the width function $W$ as follows:

- if $c \in \mathcal{L}_{Dom}$ and $c = X \in D_X$, then $W(c) = \omega(D_X)$,

- if $c \in \mathcal{L}_{At} \setminus \mathcal{L}_{Dom}$, then $W(c) = -1$,

- if $C = c \wedge C'$ or $C = c \vee C'$ and $c \in \mathcal{L}_{At}$, then $W(C) = W(c)$.

$\preceq_{Dom}$ is now defined by:

$\forall C, C' \in \mathcal{L},\ C \preceq_{Dom} C'$ if $W(C) \le W(C')$.

The sorter MinDom (respectively MaxDom) is defined using the $\preceq_{Dom}$ ordering
(respectively $\succeq_{Dom}$, the reverse ordering of $\preceq_{Dom}$).

4 The Strategy Language 

Given a solver $S$, a filter $\phi$ and a partial order $\preceq$, we now define several application
mechanisms for applying solvers to constraints. We assume that a solver is
applied only once on a given set of constraints. In the following, we consider given
a constraint system $CS = (\Sigma, \mathcal{D}, V, \mathcal{L})$. Most of the application mechanisms are
based on the same technique when applied to a constraint $C$:

1. A set $SC$ of candidates is built using the filter $\phi$ on $C$.

2. The set $SC$ is sorted using $\preceq$. We obtain

$LC = [(C_{f_1}, C_1), \ldots, (C_{f_n}, C_n)]$,

a sorted list of candidates, where $(C_{f_1}, C_1)$ is the “best” candidate w.r.t. $\preceq$.

3. The solver $S$ is applied on one or several elements of $LC$.

4. Sub-constraints modified by $S$ are replaced in their corresponding syntactical
form of $C$.

4.1 Basic Solver Compositions 

The following operators are standard and analogous to function compositions.
They are used to design solvers with “basic” functions, or create solver collaborations
with “complex” solvers. Let $R$ and $S$ be two solvers on $(\Sigma, \mathcal{D}, V, \mathcal{L})$.
Then, $\forall C \in \mathcal{L}$ we define:

- $S^0(C) = C$ (identity),

- $S; R(C) = R(S(C))$ (solver concatenation),

- $S^n(C) = S^{n-1}; S(C)$ if $n > 0$ (solver iteration),

- $S^*(C) = S^n(C)$ such that $S^{n+1}(C) = S^n(C)$ (solver fixed-point),

- $(S, R)(C) = S(C)$ or $R(C)$ (solver don’t-care).



Property 2. $S; R$, $S^n$, $S^*$, and $(S, R)$ are solvers.

Footnote (width of a domain, Example 3): For interval domains, $\omega(D_X)$ can be the difference between the upper and the
lower bound. On the other hand, for domains that are sets of elements, the width
can be defined as the cardinality of the set. In every case, the width is a numeric value.
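
The basic compositions can be sketched as higher-order functions in Python. This is a minimal sketch, not the authors' implementation: solvers are modelled as plain functions on any value that supports `==`, the names `seq`, `iterate`, `fixpoint` and `dont_care` are illustrative, and the `max_steps` guard is an added assumption so that a non-converging solver does not loop forever.

```python
import random


def identity(c):
    """S^0: return the constraint unchanged."""
    return c


def seq(s, r):
    """S;R: apply S first, then R."""
    return lambda c: r(s(c))


def iterate(s, n):
    """S^n: apply S exactly n times."""
    def run(c):
        for _ in range(n):
            c = s(c)
        return c
    return run


def fixpoint(s, max_steps=10_000):
    """S^*: apply S until the constraint no longer changes.
    max_steps is a safety guard added for this sketch."""
    def run(c):
        for _ in range(max_steps):
            nxt = s(c)
            if nxt == c:
                return c
            c = nxt
        raise RuntimeError("no fixed point reached within max_steps")
    return run


def dont_care(s, r, rng=random):
    """(S,R): apply either S or R, chosen arbitrarily."""
    return lambda c: rng.choice((s, r))(c)


# Toy "solvers" on integers, standing in for constraint transformers.
halve = lambda n: n // 2
succ = lambda n: n + 1
```

For instance, `seq(succ, halve)(7)` applies `succ` and then `halve`, and `fixpoint(halve)(37)` iterates until the value stops changing.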



4.2 Filtered and Random Applications of a Solver 

We first define two operators to apply a solver on specific components of a 
constraint. The first one takes the component randomly whereas the second one 
selects it with respect to a given criterion. 



Don’t Care Application of a Solver Given a solver $S$, a filter $\phi$, and a
constraint $C$, $dc(S, \phi)$ restricts the use of the solver $S$ to one randomly chosen
sub-constraint of a syntactical form of $C$ (obtained using the filter $\phi$):

$\forall C \in \mathcal{L},\ dc(S, \phi)(C) = C'$



where:

- $[(C_{f_1}, C_1), \ldots, (C_{f_n}, C_n)] = \phi(C)$,

- if there exists $i \in [1, \ldots, n]$ such that $S(C_{f_i}) \neq C_{f_i}$, then $C' = C_i\{C_{f_i} \mapsto S(C_{f_i})\}$,
otherwise $C' = C$.



Best Application of a Solver Given a solver $S$, a partial order $\preceq$ on $\mathcal{L}$, a
filter $\phi$, and a constraint $C$, $best(S, \preceq, \phi)$ restricts the use of the solver $S$ to
the best (w.r.t. the partial order $\preceq$) sub-constraint of a syntactical form of $C$
(obtained using the filter $\phi$) that $S$ is able to modify:

$\forall C \in \mathcal{L},\ best(S, \preceq, \phi)(C) = C'$



where:

- $[(C_{f_1}, C_1), \ldots, (C_{f_n}, C_n)] = Sorter(\preceq, \phi(C))$,

- if there exists $i \in [1, \ldots, n]$ such that $S(C_{f_i}) \neq C_{f_i}$, and $\forall j \in [1, \ldots, n]$
$(S(C_{f_j}) \neq C_{f_j} \Rightarrow i \le j)$, then $C' = C_i\{C_{f_i} \mapsto S(C_{f_i})\}$, otherwise $C' = C$.
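
The two operators above can be sketched in Python as follows. The encoding is an assumption made for illustration only: a constraint is a tuple of atoms read as a conjunction, a domain constraint is a tuple `(name, lo, hi)`, a filter returns a list of pairs `(sub_constraint, syntactical_form)`, and the sorter is passed in as a function from candidates to a list. The helpers `substitute`, `shrink`, `all_atoms` and `max_dom` are hypothetical.

```python
import random


def substitute(form, old, new):
    """C_i{C_f -> S(C_f)}: replace the sub-constraint `old` by `new` in `form`."""
    return tuple(new if atom == old else atom for atom in form)


def dc(solver, flt, rng=random):
    """Don't-care application: modify one arbitrarily chosen candidate
    among those that the solver is able to change."""
    def run(c):
        changeable = [(cf, ci) for cf, ci in flt(c) if solver(cf) != cf]
        if not changeable:
            return c
        cf, ci = rng.choice(changeable)
        return substitute(ci, cf, solver(cf))
    return run


def best(solver, sorter, flt):
    """Best application: modify the first candidate, in sorted order,
    that the solver is able to change."""
    def run(c):
        for cf, ci in sorter(flt(c)):
            new = solver(cf)
            if new != cf:
                return substitute(ci, cf, new)
        return c
    return run


# Hypothetical solver: cap the width of a domain constraint at 4.
def shrink(atom):
    name, lo, hi = atom
    return (name, lo, lo + 4) if hi - lo > 4 else atom


all_atoms = lambda c: [(atom, c) for atom in c]          # filter
max_dom = lambda cands: sorted(cands, key=lambda p: -(p[0][2] - p[0][1]))

csp = (("x", 0, 8), ("y", 0, 2))
```

With these definitions, `best(shrink, max_dom, all_atoms)(csp)` narrows the domain of `x` and leaves `y` untouched; a constraint that `shrink` cannot modify is returned unchanged.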



4.3 Concurrent and Parallel Applications of Solvers 

We now define two operators to apply several solvers on a constraint. The first
one chooses only one result depending on a given criterion. The second one composes
the final result using each application.

Concurrent Application of Solvers The operator pcc provides a non-deterministic
choice upon which we act by introducing methods: we do not care about
which solver actually solved the constraint, but we want the result to verify some
property.

A constraint property $p$ on a constraint system $(\Sigma, \mathcal{D}, V, \mathcal{L})$ is a function from
constraints to Booleans (i.e., $p : \mathcal{L} \to Boolean$).

Given a list of solvers $[S_1, \ldots, S_n]$, a list of orders on constraints $[\preceq_1, \ldots, \preceq_n]$,
a list of filters $[\phi_1, \ldots, \phi_n]$, and a property $p$, $pcc(p, [S_1, \ldots, S_n], [\preceq_1, \ldots, \preceq_n], [\phi_1, \ldots, \phi_n])$
applies once one of the solvers $S_i$ and returns a constraint that
verifies the property $p$:

$\forall C \in \mathcal{L},\ pcc(p, [S_1, \ldots, S_n], [\preceq_1, \ldots, \preceq_n], [\phi_1, \ldots, \phi_n])(C) = C'$

where:

- for all $i \in [1, \ldots, n]$, $[(C_{f_{i,1}}, C_{i,1}), \ldots, (C_{f_{i,m_i}}, C_{i,m_i})] = Sorter(\preceq_i, \phi_i(C))$,

- if there exists $(i, j) \in [1, \ldots, n] \times [1, \ldots, m_i]$ such that $p(S_i(C_{f_{i,j}}))$, and
$S_i(C_{f_{i,j}}) \neq C_{f_{i,j}}$, then $C' = C_{i,j}\{C_{f_{i,j}} \mapsto S_i(C_{f_{i,j}})\}$, otherwise $C' = C$.

Parallel Best Applications of Solvers Given a list of solvers $[S_1, \ldots, S_n]$, a
list of orders on constraints $[\preceq_1, \ldots, \preceq_n]$, and a list of stable filters $[\phi_1, \ldots, \phi_n]$
that are pairwise disjoint, $bp([S_1, \ldots, S_n], [\preceq_1, \ldots, \preceq_n], [\phi_1, \ldots, \phi_n])$ applies $n$
solvers $S_1, \ldots, S_n$ on $n$ sub-parts of one syntactical form of the constraint:

$\forall C \in \mathcal{L},\ bp([S_1, \ldots, S_n], [\preceq_1, \ldots, \preceq_n], [\phi_1, \ldots, \phi_n])(C) = C'$

where:

- for all $i \in [1, \ldots, n]$, $[(C_{f_{i,1}}, C''), \ldots, (C_{f_{i,m_i}}, C'')] = Sorter(\preceq_i, \phi_i(C))$,

- for all $i \in [1, \ldots, n]$, if there exists $j \in [1, \ldots, m_i]$ s.t. $S_i(C_{f_{i,j}}) \neq C_{f_{i,j}}$, and
for all $k < j$, $S_i(C_{f_{i,k}}) = C_{f_{i,k}}$, then $\sigma_i = \{C_{f_{i,j}} \mapsto S_i(C_{f_{i,j}})\}$, else $\sigma_i = \emptyset$,

- $C' = C''\sigma$ where $\sigma = \bigcup_{i \in [1, \ldots, n]} \sigma_i$.

4.4 Managing Sub-problems 

Finally, we define two operators to apply a solver on each component of a con- 
junction or disjunction of constraints. The result of the application of these 
operators is obtained by the conjunction or the disjunction of the resulting con- 
straints, respectively. These operators enable parallel computation, and standard 
OR_parallel computation. 

To this end, we introduce the notion of separator that can be seen as a pre- 
processing for parallel computation. Separators are mainly defined to manipulate 
the elements of conjunctions and disjunctions of constraints as elements of lists. 
Each element of the list will then be treated separately but in parallel before 
gathering together (conjunction or disjunction) all the results. 

We define the notion of separator using lists so we can sort and explore 
the search tree in a deterministic way. This is particularly important when we 
consider sequential implementations, i.e., we process the branches sequentially. 
In such cases, the use of sets leads to a non-deterministic search.
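Definition 5 (Separators). A $\wedge$_separator $\delta$ is a function from constraints of $\mathcal{L}$ to lists of constraints of $\mathcal{L}$ s.t.:

$\forall C \in \mathcal{L},\ \exists n \in \mathbb{N},\ \delta(C) = [C_1, \ldots, C_n]$ where $C \approx C_1 \wedge \ldots \wedge C_n$.

Similarly, a $\vee$_separator $\delta$ is a function from constraints of $\mathcal{L}$ to lists of constraints of $\mathcal{L}$ such that:

$\forall C \in \mathcal{L},\ \exists n \in \mathbb{N},\ \delta(C) = [C_1, \ldots, C_n]$ where $C \approx C_1 \vee \ldots \vee C_n$.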

Conjunctive Sub-problems Consider a solver $S$, and a $\wedge$_separator $\delta$. Then,
$\wedge$_p$(S, \delta)(C)$ applies (in parallel) the solver $S$ to several conjuncts (determined
by $\delta$) of the constraint $C$ and the final result is obtained by conjunction of the
results computed in parallel:

$\forall C \in \mathcal{L},\ \wedge$_p$(S, \delta)(C) = C'$

where $[C_1, \ldots, C_n] = \delta(C)$ and $C' = S(C_1) \wedge \ldots \wedge S(C_n)$.

Disjunctive Sub-problems $\vee$_p is analogous to $\wedge$_p but $\delta$ determines disjuncts,
and the final result is the disjunction of the results computed in parallel.
Given a solver $S$ and a $\vee$_separator $\delta$:

$\forall C \in \mathcal{L},\ \vee$_p$(S, \delta)(C) = C'$

where $[C_1, \ldots, C_n] = \delta(C)$ and $C' = S(C_1) \vee \ldots \vee S(C_n)$.
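
A possible reading of the two sub-problem operators in Python is given below. It is a sketch under assumptions: a compound constraint is encoded as a tuple `("and", ...)` or `("or", ...)`, the separator simply lists the operands, and a thread pool stands in for the parallel implementation (a sequential loop would give the same result).

```python
from concurrent.futures import ThreadPoolExecutor


def sub_problems(solver, separator, connective):
    """Apply `solver` to each element produced by `separator` and
    recombine the results with `connective` ("and" or "or")."""
    def run(c):
        parts = separator(c)
        with ThreadPoolExecutor() as pool:
            # map() preserves the order of the separator's list, so the
            # recombination is deterministic.
            results = list(pool.map(solver, parts))
        return (connective, *results)
    return run


def and_p(solver, separator):
    return sub_problems(solver, separator, "and")


def or_p(solver, separator):
    return sub_problems(solver, separator, "or")


# Hypothetical separator: the operands of an ("and", ...) or ("or", ...) tuple.
operands = lambda c: list(c[1:])
double = lambda n: n * 2  # toy solver on integer stand-ins for sub-constraints
```

For example, `or_p(double, operands)(("or", 1, 2, 3))` applies the toy solver to each disjunct and rebuilds the disjunction in the same order.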

4.5 Properties of the Component Functions 

In spite of its simplicity, the following property is essential: it allows one to 
manipulate component functions and solvers at the same level, and thus to create 
solver collaboration with the same strategy language. 

Property 3. best, dc, pcc, bp, $\wedge$_p, and $\vee$_p are solvers.

5 Examples 

In order to clarify the use of our strategy language, we now specify some well- 
known techniques for dealing with constraints over finite and real domains. 

5.1 Solvers for Constraints over Real Numbers 

We now design solvers for non-linear real constraints using real interval arithmetic.
In the following, a CSP $P$ is any conjunction of formulae of the form

$\bigwedge_{x_i \in X} (x_i \in D_{x_i}) \wedge C$

where a domain constraint $x_i \in D_{x_i}$ is created for each variable $x_i$ occurring in
the set of constraints $C$, $D_{x_i}$ being an interval of real numbers. Constraints are
equalities of non-linear polynomials.



MaxDom Partial Ordering We instantiate the MaxDom sorter of Example 3:
for every interval $I = [a, b]$, $\omega(I) = b - a$.

The Split Solver We consider the split solver which transforms a domain
constraint into a disjunction of two domain constraints if the width of the domain
is greater than or equal to a “minimal” width $\epsilon$. $split : \mathcal{L} \to \mathcal{L}$. For all $c = X \in D_X$
from $\mathcal{L}$,

- if $c \in \mathcal{L}_{Dom}$ such that $W(c) \ge \epsilon$, then

$split(c) = X \in D'_X \vee X \in D''_X$

where $D_X = D'_X \cup D''_X$,

- otherwise, $split(c) = c$.
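
A minimal Python sketch of the split solver for interval domains follows. Bisecting at the midpoint is an assumption of this sketch: the definition above only requires that the two new domains cover the original one. The tuple encodings `(name, lo, hi)` and `("or", ...)` are also illustrative.

```python
def split(domain_constraint, eps):
    """Split X in [lo, hi] into X in [lo, mid] or X in [mid, hi]
    when the width is at least eps; otherwise return it unchanged."""
    name, lo, hi = domain_constraint
    if hi - lo >= eps:
        mid = (lo + hi) / 2  # assumption: bisection at the midpoint
        return ("or", (name, lo, mid), (name, mid, hi))
    return domain_constraint


wide = split(("x", 0.0, 4.0), eps=1.0)
narrow = split(("x", 0.0, 0.5), eps=1.0)
```

With closed real intervals the two halves share the point `mid`; over discrete domains one would typically make the halves disjoint instead.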



A Domain Reduction Function Consider the function $b_c$ which, given a
non-linear constraint $c \in \mathcal{L}_{At} \setminus \mathcal{L}_{Dom}$, the domain $D_X$ of a variable $X \in Var(c)$,
and the domains of the other variables of $Var(c)$, returns a smaller domain for
$X$ such that $c$ is box-consistent [18] with respect to $X$. Computing $b_c$ generally
consists in applying the interval Newton method combined with a “local”
splitting mechanism to push the left and right bounds of the interval.

We now define the solver $drf : \mathcal{L} \to \mathcal{L}$. For all $C \in \mathcal{L}$, we compute $drf(C)$
depending on the syntactical form of $C$:

- if $C = X \in D_X \wedge c \wedge \bigwedge_{Y \in Var(c) \setminus \{X\}} Y \in D_Y$ where $c \in \mathcal{L}_{At} \setminus \mathcal{L}_{Dom}$, then

$drf(C) = X \in D'_X \wedge c \wedge \bigwedge_{Y \in Var(c) \setminus \{X\}} Y \in D_Y$

where $D'_X = b_c(c, D_X, \{D_Y \mid Y \in Var(c) \setminus \{X\}\})$,

- otherwise, $drf(C) = C$.



Solvers for Non-linear Constraints We now consider the solver box defined
as follows:

$box = best(drf, \succeq_{Dom}, \phi_{D \wedge c \wedge Ds})$.

When applied to a CSP $C$, box executes one step of reduction: one atomic
constraint of $C$ becomes box-consistent with respect to the variable of the CSP
that has the largest domain. Let us now consider the following solver:

$Box = box^*$.

Box is the least fixed-point of box. When applied to a CSP, Box returns an
equivalent CSP that is box-consistent (i.e., each constraint is box-consistent
w.r.t. each of its variables). The strategy of this solver is to always reduce the
variable with the largest domain, a well-known and commonly used strategy
for interval arithmetic.

Footnote (minimal width $\epsilon$ of the split solver): For continuous domains, $\epsilon$ generally represents the smallest difference that can be
computed between two numbers. For discrete domains, $\epsilon$ is set to 1.

Footnote (split solver): We generally also enforce that $D'_X \cap D''_X = \emptyset$.

Solving CSPs is generally the iteration of two mechanisms: consistency (described
above) and splitting. We now describe the splitting mechanism (using the
$\phi_D$ filter defined in Example 1), which makes it possible to extract the isolated solutions:

$Split = best(split, \succeq_{Dom}, \phi_D)$.

Applied to a CSP $C$, Split creates a disjunction of two sub-CSPs. We can now
give the solver for solving CSPs over non-linear constraints:

$S_{FullLookAhead} = (Box; Split)^*$.

The strategy is a full look ahead: each time a domain is split, the consistency
of the CSP is recomputed. At a lower level, the strategy for applying basic solvers
is Max-Dom. The solving process is neither depth-first nor breadth-first but
Max-Dom first, i.e., we make one reduction step on one branch, and then we
may choose to explore another branch.

We are now concerned with a homogeneous exploration of the branches. We
consider a $\vee$_separator named $CSP_\vee$. For all $C \in \mathcal{L}$,

$CSP_\vee(C) = [C_1, \ldots, C_n]$

such that $C \approx C_1 \vee \ldots \vee C_n$ and

$C_1 = X \in D^1_X \wedge C'$
$C_2 = X \in D^2_X \wedge C'$
$\ldots$
$C_n = X \in D^n_X \wedge C'$.

We now get another solver for non-linear CSPs:

$S'_{FullLookAhead} = \vee$_p$(Box; Split, CSP_\vee)^*$.

Depending on the implementation of the $\vee$_p operator, we will obtain a depth-first
(sequential implementation) or a parallel (parallel implementation) solving
process.



5.2 Solvers for Constraints over Finite Domains 

We exemplify here the use of the strategy language by simulating some heuristics
widely used for solving CSPs over finite domains. The skeleton of the solvers that
we propose is similar to the one used in the previous example. In the following,
a CSP $P$ over finite domains is any conjunction of formulae of the form:

$\bigwedge_{x_i \in X} (x_i \in D_{x_i}) \wedge C$

where a domain constraint $x_i \in D_{x_i}$ is created for each variable $x_i$ occurring
in the set of constraints $C$, $D_{x_i}$ being a finite set of values. Solving this kind
of problem can be seen as an interleaving process between local consistency
verification and enumeration.
A Domain Reduction Function The most widely used level of consistency
verification, Arc-Consistency, can be expressed as the repeated application of
the following transformation rule that reduces the set of possible values that the
variables can take.

$x_i \in D_{x_i} \wedge c \wedge C \longrightarrow x_i \in RD(x_i \in D_{x_i}, c) \wedge c \wedge C$

if $RD(x_i \in D_{x_i}, c) \neq D_{x_i}$

where $RD(x_i \in D_{x_i}, c)$ stands for the set $\{v_i \in D_{x_i} \mid (\exists v_1 \in D_{x_1}, \ldots, v_{i-1} \in D_{x_{i-1}}, v_{i+1} \in D_{x_{i+1}}, \ldots, v_n \in D_{x_n}) : c(v_1, \ldots, v_i, \ldots, v_n)\}$.
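
The reduced domain $RD$ can be computed by brute force over finite domains, as in the following Python sketch. The representation is an assumption of the sketch: domains are given as a list of sets in variable order, and the constraint is a Python predicate taking one value per variable.

```python
from itertools import product


def reduce_domain(i, domains, constraint):
    """Return the values of variable i that have a support, i.e. for which
    the other variables can take values making `constraint` true."""
    others = [d for k, d in enumerate(domains) if k != i]
    supported = set()
    for v in domains[i]:
        for rest in product(*others):
            # Re-insert v at position i among the other variables' values.
            values = rest[:i] + (v,) + rest[i:]
            if constraint(*values):
                supported.add(v)
                break
    return supported


less_than = lambda x, y: x < y
domains = [{1, 2, 3}, {1, 2, 3}]
rd_x = reduce_domain(0, domains, less_than)
rd_y = reduce_domain(1, domains, less_than)
```

For the constraint `x < y` with both domains equal to {1, 2, 3}, the value 3 is removed from the domain of `x` and the value 1 from the domain of `y`.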

Let us define it more precisely in the same way as we did for the previous
example. For all $C \in \mathcal{L}$, we compute $LocalConsistency(C)$ depending on the
syntactical form of $C$:

- if $C = X \in D_X \wedge c \wedge \bigwedge_{Y \in Var(c) \setminus \{X\}} Y \in D_Y$ where $c \in \mathcal{L}_{At} \setminus \mathcal{L}_{Dom}$, then

$LocalConsistency(C) = X \in D'_X \wedge c \wedge \bigwedge_{Y \in Var(c) \setminus \{X\}} Y \in D_Y$

where $D'_X = RD(X \in D_X, c)$,

- otherwise, $LocalConsistency(C) = C$.



The SplitDomain Solver In order to carry out enumeration, we consider a
solver for splitting domains. We re-use the split solver defined in the previous
sub-section. This time, $\epsilon$ is set to 1, and we enforce $D'_X \cap D''_X = \emptyset$.

MinDom Partial Ordering Finally, we define the MinDom sorter that returns
the domain constraint occurring in the set of constraints $C$ with the minimum
set of values. To this end, we just need to instantiate the MinDom sorter
presented in Example 3: $\omega$ is instantiated by the cardinality operator $|\cdot|$.

Solvers with Usual Strategies In the following paragraphs, we also use the
filters $\phi_D$ already defined in Example 1 and $\phi_{D \wedge c \wedge Ds}$ defined in Example 2.

FullLookahead This heuristic first enforces local consistency and then carries
out an enumeration step followed by local consistency verification. Local consistency
verification is always carried out on the whole set of constraints:

$FullLookahead = dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*;$
$\quad (dc(SplitDomain, \phi_D);\ dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*)^*$

FullLookahead with Minimum Domain This is a simple modification of the first
heuristic: the enumeration is carried out based on the variable with the minimum
set of remaining values:

$FullLookaheadMinDom = dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*;$
$\quad (best(SplitDomain, \preceq_{Dom}, \phi_D);$
$\quad\ dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*)^*$
Forward Checking This heuristic, when enforcing local consistency, takes into
account just the constraints that are directly related to the enumeration variable.
We consider another filter $\phi_{D \wedge c \wedge C \wedge Ds}$: this filter returns a domain constraint
$D$ over a variable $X$, a constraint $c$ that contains $X$, all the constraints (the
conjunction $C$) that contain $X$ (except $c$), and all the domain constraints of the
variables that appear in $c \wedge C$. We also consider an extension SplitDomain’ of
the solver SplitDomain that applies on the result of the filter $\phi_{D \wedge c \wedge C \wedge Ds}$. When
applied to a constraint $D \wedge c \wedge C \wedge Ds$, SplitDomain’ returns $SplitDomain(D) \wedge c \wedge C \wedge Ds$.
We can formulate Forward Checking as follows:

$ForwardChecking =$
$\quad dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*;$
$\quad (dc(SplitDomain';\ dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*,\ \phi_{D \wedge c \wedge C \wedge Ds}))^*$

The enumeration carried out by these strategies applies the solver SplitDomain,
which adds a disjunction each time it is applied. We can imagine that the
repeated application of this solver could generate a number of disjunctions too
difficult to deal with. In order to avoid this situation we could use the $\vee$_p
construction defined in Section 4.4. In this way, when we are carrying out an
enumeration we can really decompose a CSP into two subproblems. The obvious
advantage is to deal with a simpler problem. The solution to the original
problem will be in the union of the solutions to all subproblems.

The modified version of the Full Lookahead heuristic is the following:

$dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*;$
$\quad (dc(SplitDomain, \phi_D);\ \vee$_p$(dc(LocalConsistency, \phi_{D \wedge c \wedge Ds})^*, \delta))^*$



6 Conclusions 

We have presented the definition of a strategy language for solving CSPs. A
key point in this work is the introduction of the concepts of constraint filter,
separator, and sorter. These operators allow us to show in the strategy language
the syntactical transformations generally hidden in current solvers. Then,
using these operators, we have defined a set of constructors that allow one to define
single solvers as well as collaborations of solvers.

We have exemplified the use of this language by the simulation of well-known
techniques for solving non-linear constraints over real domains and CSPs over
finite domains.

To show the broad scope of potential applications of our control language, we
have already designed several solvers that are considered to be of different natures (such
as the Simplex algorithm, propagation-based solvers, and Gröbner bases computation).
We are currently working on the implementation of this language in order
to evaluate the real applicability of this framework.
Abstract. Interval constraint-based solvers are valuable tools to scientists
and engineers since they ensure many useful properties such as
completeness of the result. However, their lack of soundness is sometimes 
a major flaw. This paper presents an algorithm ensuring soundness by 
computing inner approximations of real relations using only “traditional” 
numerical methods. A slight modification of the algorithm permits han- 
dling constraint systems with one universally quantified variable. An ap- 
plication to declarative modelling of camera movements is also described. 



1 Introduction 

Expressiveness, efficiency, and reliability of interval constraint-based tools [16, 2, 
4] make them a solution of choice for solving non-linear systems of equations such 
as the ones arising in robotics [13], chemistry [11], or electronics [14]. Relying on 
interval arithmetic [12, 1], these tools ensure completeness (all solutions present 
in the input are retained), and permit bracketing solutions with an “arbitrary” 
accuracy. On the other hand, soundness is not guaranteed, while it is sometimes 
a strong requirement. For example, consider a civil engineering problem [15] 
such as floor design where retaining non-solution points may lead to a physically 
unrealizable structure. 

This paper presents an algorithm whose output is a set of sound boxes of vari- 
able domains for some constraint system. Soundness is achieved by computing 
inner approximations of real relations using box consistency [3] — a well-known, 
efficient, local consistency [10] — on the negation of the involved constraints. 
Next, a slight modification of the algorithm is described, which permits solving 
constraint systems where one variable is universally quantified. Its application 
to temporal constraints describing camera movements (virtual cameraman problem [9])
is then presented along with some preliminary results.

The organization of the paper is as follows: Section 2 introduces the basics 
related to interval constraint solving; Section 3 presents an extension of the 
theoretical framework given in Section 2 to support the notion of inner approximation,
along with the corresponding new algorithms; Section 4 describes modifications
of the algorithms of Section 3 that make it possible to handle constraint systems
with one universally quantified variable; Section 5 discusses the use of the algorithms
for the “virtual cameraman problem”; finally, Section 6 summarizes the
contribution of the paper, and points out future directions for improvement of
the methods described here.

2 Interval Constraint Solving 

Finite representation of numbers by computers hinders exact solving of real
constraints. Underlying real relations must be approximated by considering one
of their computer-representable supersets or subsets. This section presents the
basics related to the conservative approximation of real relations, that is,
approximation by a superset. Approximation by a subset is deferred until the next section. The shift from reals
to floating-point intervals is first described; the notion of outer approximation
of a real set based on intervals is then presented.



2.1 Preliminary Notions 

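Let $\mathbb{R}$ be the set of reals and $\mathbb{F} \subset \mathbb{R}$ a finite subset of reals corresponding
to floating-point numbers in a given format [8]. Symbol $\infty$ is introduced to
represent infinity, that is: $\forall g \in \mathbb{F}: -\infty < g < +\infty$, and $\mathbb{R} \subseteq (-\infty\ ..\ +\infty)$. Let
$\mathbb{F}^\infty = \mathbb{F} \cup \{-\infty, +\infty\}$. Hereafter, $r$ and $s$ (resp. $g$ and $h$), possibly subscripted,
are assumed to be elements of $\mathbb{R}$ (resp. $\mathbb{F}^\infty$).

Let $\mathcal{L} = \{(, [\}$ (resp. $\mathcal{U} = \{), ]\}$) be the set of left (resp. right) brackets. Let
$\mathcal{B} = \mathcal{L} \cup \mathcal{U}$ be the set of brackets totally ordered by the ordering $\prec$ [6]: $) \prec [ \prec ] \prec ($.
The set of floating-point bounds $\mathbb{F}^\diamond$ is defined from $\mathcal{B}$ and $\mathbb{F}$ as follows:

$\mathbb{F}^\diamond = \mathbb{F}^\triangleleft \cup \mathbb{F}^\triangleright$ where

$\mathbb{F}^\triangleleft = (\mathbb{F} \times \mathcal{L} \cup \{(-\infty, (), (+\infty, ()\})$
$\mathbb{F}^\triangleright = (\mathbb{F} \times \mathcal{U} \cup \{(-\infty, )), (+\infty, ))\})$

The set of real bounds $\mathbb{R}^\diamond$ is defined likewise. Floating-point bounds are totally ordered
by the ordering $\triangleleft$: $\forall \beta_1 = (g, \alpha_1), \beta_2 = (h, \alpha_2) \in \mathbb{F}^\diamond$: $\beta_1 \triangleleft \beta_2 \iff (g < h) \vee (g = h \wedge \alpha_1 \prec \alpha_2)$.
A similar ordering may be defined over $\mathbb{R}^\diamond$.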

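Rounding operations mapping real bounds to float bounds are defined as
follows:

Bound downward rounding: $\lfloor \cdot \rfloor$ maps a real bound $\beta$ to $\max\{\gamma \in \mathbb{F}^\triangleleft \mid \gamma \trianglelefteq \beta\}$.

Bound upward rounding: $\lceil \cdot \rceil$ maps a real bound $\beta$ to $\min\{\gamma \in \mathbb{F}^\triangleright \mid \beta \trianglelefteq \gamma\}$.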



Bounds are used to construct intervals as follows: $\mathbb{I}_\circ = \mathbb{F}^\triangleleft \times \mathbb{F}^\triangleright$ is the set of
closed/open floating-point intervals (henceforth referred to as intervals), with the
following notations used as shorthands: $((g, [), (h, ])) \equiv [g\ ..\ h] \equiv \{r \in \mathbb{R} \mid g \le r \le h\}$, etc.
For the sake of simplicity, the empty set $\emptyset$ is uniquely represented
in $\mathbb{I}_\circ$ by the interval $(+\infty\ ..\ -\infty)$.

In the rest of the paper, a Cartesian product of $n$ intervals $B = I_1 \times \cdots \times I_n$
is called a box. A non-empty interval $I = (\beta_1, \beta_2)$ with $\beta_1 \in \mathbb{F}^\triangleleft$ and $\beta_2 \in \mathbb{F}^\triangleright$
is said to be canonical whenever $\beta_2|_v \le (\beta_1|_v)^+$, where $\beta|_v$ is the numerical part of
bound $\beta$, and $g^+$ is the smallest float greater than $g$. An $n$-ary box $B$ is canonical
whenever the intervals $I_1, \ldots, I_n$ are canonical. Given a variable $v$, an interval
$I$, and boxes $B$ and $D$, let $Dom_B(v) \in \mathbb{I}_\circ$ be the domain of $v$ in box $B$, and
$B|_{v,D}$ the box obtained by replacing the domain of $v$ in box $B$ by its domain in box
$D$. The power set of a set $S$ is written $\mathcal{P}(S)$.



2.2 Approximating a Relation by a Box 

A constraint is an atomic formula involving variables of $V_\mathbb{R} = \{x_1, x_2, \ldots\}$. Given
a constraint $c(x_1, \ldots, x_n)$, $\rho_c$ denotes the underlying real relation. For the sake
of readability, the relation $\rho_{c_i}$ for some constraint $c_i$ is written $\rho_i$ whenever that
notation is non-ambiguous. Let $\bar{c}$ be $\neg c$, that is: $\rho_{\bar{c}} = \mathbb{R}^n \setminus \rho_c$.

A real relation $\rho$ may be approximated conservatively by the smallest box
(w.r.t. set inclusion) $Outer_\circ(\rho)$ containing it.

Discarding values of the variable domains for which a constraint $c$ does not
hold is done by contracting operators, whose main properties are contractance,
completeness, and monotonicity.

The outer-approximation operator $OC1_c$ is a contracting operator for $c$ that
tightens variable domains using the $Outer_\circ$ approximation:

Definition 1 (Outer-approximation operator). Let $c$ be an $n$-ary constraint,
$\rho_c$ its underlying relation, and $B$ a box. An outer-approximation operator of $c$
is a function $OC1_c : \mathbb{I}_\circ^n \to \mathbb{I}_\circ^n$ defined by: $OC1_c(B) = Outer_\circ(B \cap \rho_c)$

Proposition 1 (Completeness of OC1). Given a constraint $c$, the following
relation holds for every box $B$: $(B \cap \rho_c) \subseteq OC1_c(B)$

The implementation of outer-approximation operators is easily done only for
a limited class of constraints (primitives). The other constraints are solved by
decomposing them into conjunctions of primitives. In order to overcome the loss
of domain tightening due to the introduction of new variables by the decomposition
process, Benhamou et al. [3] defined a new kind of operator (the outer-box
approximation operator OCb) which considers constraints globally. The following
relation between OCb and OC1 does hold:

Proposition 2 (Completeness of OCb). Given a constraint $c$, and a box $B$:

$(B \cap \rho_c) \subseteq Outer_\circ(B \cap \rho_c) \subseteq OCb_c(B)$.

Operators OC1 and OCb narrow the domains of variables occurring in one
constraint. Solving constraint systems is done by an algorithm (OC2) which
computes the greatest common fixed-point included in the initial domains of all
the contracting operators associated to each constraint (see details in [5]).
3 Inner Approximations

In order to compute only solution sets, the outer-approximation of a relation
$\rho \subseteq \mathbb{R}^n$ is replaced by the inner approximation of $\rho$, which is the subset of all
the elements $r \in \mathbb{R}^n$ for which the statement $r \in \rho$ may be checked using only
floating-point numbers.

Definition 2 (Inner approximation of a relation). Given an $n$-ary relation
$\rho$, the inner approximation of $\rho$ is defined by:

$Inner_\circ(\rho) = \{r \in \mathbb{R}^n \mid Outer_\circ(\{r\}) \subseteq \rho\}$

Proposition 3 (Properties of the Inner approximation). The Inner approximation
is monotone, idempotent, and distributive w.r.t. the union and intersection
of subsets of $\mathbb{R}^n$.

The narrowing of variable domains occurring in a constraint is done in the
same way as in the outer-approximation case: an inner-approximation operator
associated to each constraint discards from the initial box all the inconsistent
values. The result is a set of boxes.

Definition 3 (Inner-approximation operator). Let $c$ be an $n$-ary constraint,
and $B$ a box. An inner-approximation operator of $c$ is a function $IC1_c : \mathbb{I}_\circ^n \to \mathcal{P}(\mathbb{I}_\circ^n)$
defined by:

$IC1_c(B) \subseteq Inner_\circ(B \cap \rho_c)$

Proposition 4 (Soundness of IC1). Given a constraint $c$ and a box $B$, an
inner-approximation operator $IC1_c$ for $c$ is such that $IC1_c(B) \subseteq (B \cap \rho_c)$

Proposition 4 is an immediate consequence of the Inner and Outer definitions.

Inner-approximation operators with stronger properties may be defined, provided
some assumptions (namely the ability to compute the “Outer” for constraint
expressions) are fulfilled. These operators are optimal in the sense
defined below.

Definition 4 (Optimal inner-approximation operator). Let $c$ be an $n$-ary
constraint, $B$ a box, and $IC1_c$ an inner-approximation operator for $c$. $IC1_c$ is said to be
optimal if and only if $IC1_c(B) = Inner_\circ(B \cap \rho_c)$

Devising an inner-approximation operator for a constraint is not as easy as
devising an outer-approximation operator since interval techniques only permit
enforcing some partial consistencies, that is, values which are discarded are guaranteed
to be non-solutions while no information is known about those which are
kept. Algorithm 1 ($ICA1_c$) implements an optimal inner-approximation operator
for every $n$-ary constraint $c$ by using $OC1_{\bar{c}}$. Since values discarded by this operator
are guaranteed to be non-solutions of $\bar{c}$ (by completeness of OC1), they
are guaranteed solutions for $c$.
Algorithm 1. $ICA1_c$ - Inner contracting algorithm for a constraint $c$

    ICA1_c(in: $B \in \mathbb{I}_\circ^n$; out: $W \in \mathcal{P}(\mathbb{I}_\circ^n)$)
    begin
        $D \leftarrow OC1_{\bar{c}}(B)$
        $W \leftarrow B \setminus D$
        if ($D \neq \emptyset$ and $\neg$Canonical($D$)) then
            $(D_1, D_2) \leftarrow$ PlainSplit($D$)
            $W \leftarrow W \cup ICA1_c(D_1) \cup ICA1_c(D_2)$
        endif
        return ($W$)
    end

The PlainSplit function used in the algorithm splits in two intervals one of the non-canonical domains of
$D$. In a typical implementation, each non-canonical domain is chosen in turn in a round-robin fashion at
each call of PlainSplit.
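
The recursion of Algorithm 1 can be illustrated with a small Python sketch. It is deliberately simplified and rests on assumptions that are not in the paper: there is a single variable, domains are closed intervals of integers (so an interval with equal bounds plays the role of a canonical interval), and the outer approximation of the negated constraint is obtained by brute-force enumeration instead of interval arithmetic.

```python
def outer_of_negation(c, box):
    """Smallest interval containing every point of `box` that violates c
    (stand-in for the outer operator applied to the negation of c);
    None if no point violates c."""
    lo, hi = box
    violating = [x for x in range(lo, hi + 1) if not c(x)]
    return (violating[0], violating[-1]) if violating else None


def difference(box, d):
    """box minus d, as a list of intervals (d is contained in box or is None)."""
    if d is None:
        return [box]
    (lo, hi), (dlo, dhi) = box, d
    parts = []
    if lo < dlo:
        parts.append((lo, dlo - 1))
    if dhi < hi:
        parts.append((dhi + 1, hi))
    return parts


def ica1(c, box):
    """Return a list of intervals that contain only solutions of c."""
    d = outer_of_negation(c, box)
    w = difference(box, d)            # points outside d all satisfy c
    if d is not None and d[0] != d[1]:  # d is non-empty and not canonical
        mid = (d[0] + d[1]) // 2
        w = w + ica1(c, (d[0], mid)) + ica1(c, (mid + 1, d[1]))
    return w


inner = ica1(lambda x: x * x <= 4, (-5, 5))
```

On this toy problem, `inner` is `[(-2, 0), (1, 2)]`: every returned interval contains only solutions of the constraint, which is the soundness property that the algorithm is meant to guarantee.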



Handling constraint systems using inner-approximation operators is done by
Algorithm ICA2 (see Algorithm 2): each constraint of the system is considered
in turn together with the sets of elements verifying all the previously considered
constraints. The main difference between OC2 and ICA2 lies in that each
constraint needs only be considered once, since after having been considered for
the first time, the elements remaining in the variable domains are all solutions of
the constraint. As a consequence, narrowing some domain later does not require
additional work.

Proposition 5 (Property of ICA2). Let $S = \{c_1, \ldots, c_m\}$ be a set of constraints,
and $B$ a box. Then, $ICA2(S, \{B\}) \subseteq Inner_\circ(B \cap \rho_1 \cap \cdots \cap \rho_m)$.

Inclusion in Proposition 5 may be replaced by an equality provided the operators
IC1 used are all optimal.



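Algorithm 2. ICA2 - Inner contracting algorithm for $c_1 \wedge \cdots \wedge c_m$

    ICA2(in: $S = \{c_1, \ldots, c_m\} \subseteq \mathcal{C}$, $A \in \mathcal{P}(\mathbb{I}_\circ^n)$; out: $W \in \mathcal{P}(\mathbb{I}_\circ^n)$)
    begin
        if ($S \neq \emptyset$) then
            $B \leftarrow \emptyset$
            foreach $D \in A$ do
                $B \leftarrow B \cup IC1_{c_1}(D)$
            endforeach
            return (ICA2($S \setminus \{c_1\}$, $B$))
        else
            return ($A$)
        endif
    end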
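
The control structure of Algorithm 2 can be sketched in Python as follows. The inner-approximation operator is passed in as a parameter `ic1`; for the usage example, and only as an assumption of this sketch, a constraint is a pair of integer bounds `(lo, hi)` meaning lo <= x <= hi, for which interval intersection is an exact inner operator.

```python
def ica2(constraints, boxes, ic1):
    """Narrow every box with the first constraint, then recurse on the
    remaining constraints; each constraint is considered only once."""
    if not constraints:
        return boxes
    first, rest = constraints[0], constraints[1:]
    narrowed = []
    for d in boxes:
        narrowed.extend(ic1(first, d))
    return ica2(rest, narrowed, ic1)


def intersect(bounds, box):
    """Toy inner operator for the constraint lo <= x <= hi on an interval."""
    lo, hi = max(bounds[0], box[0]), min(bounds[1], box[1])
    return [(lo, hi)] if lo <= hi else []


result = ica2([(0, 10), (5, 20)], [(-3, 7), (8, 30)], intersect)
```

Here `result` is `[(5, 7), (8, 10)]`: the boxes kept after the first constraint are narrowed by the second one without revisiting the first.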
4 Introducing Quantifiers

Given a n-ary constraint c{x\, . . . , x„) and a box B = I\ x ■■■ X In, applying 
the inner-contracting operator IClc to B gives a set of boxes U = {B'^, . . . , B'^} 
where each B'^ = D\X ■ ■ ■ x Dn is a sub-box of B such that: Vri G D\,. . . , Vr„ G 
Dn ■ c(ri, . . . , r„) does hold. 

Therefore, solving a constraint of the form Mxk- c{x\, . . . ,Xn) consists in 
retaining only boxes B' = [D\ x • • • X Dn) of U such that Dk — Ik- 
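
The retention step can be sketched in a few lines of Python. This is a minimal illustration, assuming boxes are represented as tuples of `(lower, upper)` pairs; the function name `retain_universal` and the sample boxes are hypothetical and serve only to show the test $D_k = I_k$.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Box = Tuple[Interval, ...]


def retain_universal(inner_boxes: List[Box], initial_box: Box, k: int) -> List[Box]:
    """Keep the inner boxes whose k-th domain equals the whole initial domain I_k."""
    return [b for b in inner_boxes if b[k] == initial_box[k]]


# Initial box B = I_0 x I_1; the variable with index 1 is universally quantified.
initial = ((0.0, 4.0), (0.0, 1.0))
# Sample output of an inner-contracting operator (every point of each box is a solution).
inner = [
    ((0.0, 1.0), (0.0, 1.0)),   # covers all of I_1: kept
    ((1.0, 2.0), (0.0, 0.5)),   # covers only part of I_1: dropped
    ((3.0, 4.0), (0.0, 1.0)),   # covers all of I_1: kept
]
kept = retain_universal(inner, initial, k=1)
```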

Given the universally quantified variable $v$, Algorithm $\mathrm{ICA3}_c$ (see Algorithm 3) narrows the domains of all variables occurring in a constraint $c$ except $v$, and is an optimal inner-approximation operator for the constraint $\forall v : c$.

An efficient algorithm ($\mathrm{ICAb3}_c$) computing an inner-approximation operator for the constraint $\forall v : c$ may be derived from Algorithm 3 by replacing the operator OC1 by the outer-box approximation operator OCb. Note that optimality is then lost. In the same way, replacing $\mathrm{IC1}_{c_i}$ by $\mathrm{ICAb3}_{c_i}$ in ICA2 leads to Algorithm ICAb4, which computes an inner approximation for the constraint $\forall v : c_1 \wedge \dots \wedge c_m$.



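Algorithm 3. $\mathrm{ICA3}_c$ - Inner contracting algorithm for $\forall v : c$

$\mathrm{ICA3}_c$(in: $B \in \mathbb{I}^n$, $v \in \mathcal{V}_c$; out: $W \in \mathcal{P}(\mathbb{I}^n)$)
begin
    $D \leftarrow \mathrm{OC1}_{\neg c}(B)$
    [...]
    $W \leftarrow B \setminus D|_{v,B}$
    if ($D \neq \emptyset$ and $\neg\mathrm{Canonical}_v(D)$) then
        $(D_1, D_2) \leftarrow \mathrm{Split}_v(D)$
        $W \leftarrow W \cup \mathrm{ICA3}_c(D_1, v) \cup \mathrm{ICA3}_c(D_2, v)$
    endif
    return ($W$)
end

The $\mathrm{Split}_v$ function used in the algorithm splits one of the non-canonical domains of $D$ into two intervals. The domain of $v$ is never considered for splitting. In the same way, $\mathrm{Canonical}_v$ tests canonicity for all domains but the one of variable $v$.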



5 An Application to the “Virtual Cameraman Problem” 

Jardillier and Languenou [9] devised the prototype of a declarative modeller allowing an artist to specify the movements of a camera using the vocabulary proper to the field (panoramic shot, travelling, ...). The description of the movements is translated into a constraint system where the time $t$ is a universally quantified variable. To solve a system of the form $\forall t : c_1 \wedge \dots \wedge c_m$, they use Algorithm EIA4, which computes an inner approximation by decomposing the initial domain $I_t$ of $t$ into canonical intervals $I_t^1, \dots, I_t^p$, and testing whether $c_1 \wedge \dots \wedge c_m$ holds for the boxes $I_1 \times \dots \times I_t^1 \times \dots \times I_n, \ \dots, \ I_1 \times \dots \times I_t^p \times \dots \times I_n$. These evaluations give results in a three-valued logic, namely (true, false, unknown). Boxes labeled true contain only solutions, boxes labeled false contain no solution at all, and boxes labeled unknown are split recursively and re-tested until they can be asserted true or false, or canonicity is reached. Retained boxes are those verifying: $\forall j \in \{1, \dots, p\} : \mathrm{eval}_{c_1 \wedge \dots \wedge c_m}(I_1 \times \dots \times I_t^j \times \dots \times I_n) = \mathrm{true}$.

We have devised a new modeller, replacing EIA4 by ICAb4. Experimental evidence shows that it is up to 40 times faster than the prototype described in [9] on a set of benchmarks. Moreover, ICAb4 splits the explored space into bigger consistent chunks than EIA4, and avoids losing time extensively splitting non-solution areas. Figure 1 graphically compares the splitting sequences for the explored space of $\mathit{circle}_{2,2}$, a collision problem: given points $B_1$ and $B_2$ moving along circles of radius $r_1$ and $r_2$, find all the possible locations of a point $A$ such that $B_1$ (resp. $B_2$) is always at a distance greater than $d_1$ (resp. $d_2$) from $A$. Constraints to solve are then of the form: $\forall \theta \in [-\pi, +\pi] : \sqrt{(r_1 \sin(\theta) - x)^2 + (r_1 \cos(\theta) - y)^2} \geq d_1$.

In the figure, the darker the area, the later its exploration was achieved. White areas stand for non-solution sets.




[Figure 1, panels: EIA4, ICAb4]

Fig. 1. Comparison of the solutions generation order for $\mathit{circle}_{2,2}$



6 Conclusion 

Unlike the methods used to deal with universally quantified variables described in [7], the algorithms presented in this paper are purely numerical ones (except for the negation of constraints). Since they rely on “traditional” techniques used by most interval constraint-based solvers, they may benefit from the active research aimed at speeding up these tools. However, they are for the moment limited to a single universally quantified variable, while the methods of [7] deal with many variables and quantifiers (existential and/or universal). Achieving such a generalization is a major direction for future research.
Abstract. In this paper we examine a technology for solving problems on graphs in the constraint programming framework called Subdefinite Models. We briefly describe the underlying mechanism of constraint propagation. We present in detail the facilities for specifying graph problems as subdefinite models. We discuss a class of graph problems, with emphasis on those that have not been discussed before.



1 Introduction 

Constraint programming, a popular paradigm in computer science, allows one to solve a large class of problems from different fields stated as Constraint Satisfaction Problems [1]. The apparatus of subdefinite models, proposed by Narin’yani [2] and developed in our works [3], is a powerful constraint programming framework. In this paper we discuss the extension of this framework for solving problems on graphs. In Section 2 we describe the mechanism of constraint propagation in subdefinite models enriched with facilities for the representation and processing of compound objects like graphs. Section 3 describes the specification of some graph problems in this framework.



2 Constraint Propagation Based on Subdefinite Models 

In this section we define the notions of constraint satisfaction problem, subdefi- 
nite extension of a domain, filtering of a relation, and describe the algorithm of 
constraint propagation. 



2.1 Constraint Satisfaction Problem 

Definition 1. A Constraint Satisfaction Problem (CSP) is a pair $(V, C)$, where

- $V$ is a (finite) set of variables, each variable $v \in V$ has its domain $D_v$;
- $C = \bigcup_{m=1}^{\infty} C_m$ is a (finite) set of constraints, each constraint $c \in C_m$ has $m$ arguments $\mathrm{Arg}_c : \{1, \dots, m\} \to V$ and an $m$-ary relation over the argument domains, $R_c \subseteq D_{\mathrm{Arg}_c(1)} \times \dots \times D_{\mathrm{Arg}_c(m)}$.

A solution of CSP $(V, C)$ is an assignment of a value $a_v \in D_v$ to each variable $v \in V$ such that for all $c \in C$ (let $c \in C_m$) $(a_{\mathrm{Arg}_c(1)}, \dots, a_{\mathrm{Arg}_c(m)}) \in R_c$.

Clearly, we do not have a universal algorithm for finding all solutions of a given CSP. (In the case of finite domains we deal with an NP-hard problem, and there are universal algorithms, like "generate-and-test" or "backtracking", for solving CSPs over such domains.) However, there are universal algorithms for finding an "approximation" of the set of all solutions of the CSP. These algorithms are known as "constraint propagation" algorithms. To describe our variant of one such algorithm, we should first define some additional notions.

2.2 Subdefinite Extensions 

Definition 2. Given a domain $D$, its subdefinite extension (SD-extension) is a domain (denoted by $*D$) with the following properties:

- $*D$ is a finite set of subsets of $D$,
- $\emptyset$ and $D$ are elements of $*D$,
- if $\mathbf{d}'$ and $\mathbf{d}''$ belong to $*D$, then $\mathbf{d}' \cap \mathbf{d}'' \in *D$.

Elements of an SD-extension will be denoted by bold letters. Any subset $D'$ of $D$ can be approximated in the SD-extension $*D$ as follows:

$$\mathrm{app}_{*D}(D') = \bigcap_{D' \subseteq \mathbf{d} \in *D} \mathbf{d}.$$
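
As a small illustration of the approximation operator, the following Python sketch computes $\mathrm{app}_{*D}(D')$ for an SD-extension given explicitly as a list of frozensets. The function name `app` mirrors the notation above; the particular domain and extension are made-up examples, chosen so that the extension is closed under intersection.

```python
from functools import reduce
from typing import FrozenSet, List


def app(sd_extension: List[FrozenSet[int]], subset: FrozenSet[int]) -> FrozenSet[int]:
    """Intersection of all elements of the SD-extension that contain `subset`.

    The whole domain D belongs to the extension, so the list of
    candidates is never empty.
    """
    candidates = [d for d in sd_extension if subset <= d]
    return reduce(lambda a, b: a & b, candidates)


D = frozenset({1, 2, 3, 4})
# Contains the empty set and D, and is closed under intersection.
ext = [frozenset(), frozenset({2}), frozenset({1, 2}), frozenset({2, 3}), D]

a1 = app(ext, frozenset({1}))       # smallest representable superset of {1}
a2 = app(ext, frozenset({1, 3}))    # only D itself contains {1, 3}
```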



Example 1. Let $D$ be a finite domain. Then we consider $*D = 2^D$, the set of all subsets of $D$, as a subdefinite extension of the domain $D$.

The notion of the SD-extension allows one to apply a single constraint propagation algorithm not just to finite domains, but also to infinite or continuous ones.

Example 2. Let $\mathcal{R}$ be the set of all real numbers. Consider its finite subset $R_0$. An $R_0$-bounded interval $\mathbf{x} = [\underline{x}, \overline{x}]$ (where $\underline{x}, \overline{x} \in R_0 \cup \{-\infty, +\infty\}$) is defined as the set $\{x \in \mathcal{R} \mid \underline{x} \leq x \leq \overline{x}\}$. The set of all $R_0$-bounded intervals will be denoted by $I\mathcal{R}(R_0)$. It is easy to see that $I\mathcal{R}(R_0)$ is an SD-extension of $\mathcal{R}$.

Example 3. Let $Z$ be the set of all integer numbers. Then one can build a subdefinite extension of $Z$ either as in example 1 (for a finite subset of $Z$), or as in example 2.

Example 4. Let $D_1, \dots, D_n$ be a set of domains, and let $*D_1, \dots, *D_n$ be their SD-extensions. Consider a compound domain $D = D_1 \times \dots \times D_n$. One can build an SD-extension $*D$ of the domain $D$ as follows:

$$*D = *D_1 \times \dots \times *D_n.$$

It satisfies all the conditions from definition 2. Since elements of a subdefinite extension are sets, any element $\mathbf{d} \in *D$ will be considered hereinafter both as a tuple, $\mathbf{d} = (\mathbf{d}_1, \dots, \mathbf{d}_n)$, and as a set, $\mathbf{d} = \mathbf{d}_1 \times \dots \times \mathbf{d}_n$. We hope that this notation will not confuse the reader.

Other examples of subdefinite extensions of different domains can be found 
in [4]. 



2.3 Filtering 

Definition 3. Let $D_1, \dots, D_n$ be domains, $*D_1, \dots, *D_n$ be their SD-extensions, and $R$ be a relation over them (i.e. $R \subseteq D_1 \times \dots \times D_n$). The filtering function,

$$\mathcal{F}_R : *D_1 \times \dots \times *D_n \to *D_1 \times \dots \times *D_n,$$

of the relation $R$ in SD-extensions of the domains is defined as follows:

$$\mathcal{F}_R(\mathbf{d}_1, \dots, \mathbf{d}_n) = \mathrm{app}_{*D_1 \times \dots \times *D_n}(R \cap \mathbf{d}_1 \times \dots \times \mathbf{d}_n).$$

The meaning of the filtering function of the relation $R$ is the following. Let $\mathbf{d}_i$ be the set of admissible values of variable $x_i$ (for $i = 1, \dots, n$), and let the values of $x_1, \dots, x_n$ be connected by the relation $R$. The filtering function $\mathcal{F}_R$ "filters" the set of admissible values for each variable, excluding the values that are known to be incompatible, in the sense of the relation $R$, with the values of the other variables.

Example 5. Let $D_1, \dots, D_n$ be finite domains, $*D_i = 2^{D_i}$ ($i = 1, \dots, n$) be their SD-extensions, and $R \subseteq D_1 \times \dots \times D_n$ be a relation over the domains. Then

$$\mathcal{F}_R(\mathbf{d}_1, \dots, \mathbf{d}_n) = (\pi_1(R \cap \mathbf{d}_1 \times \dots \times \mathbf{d}_n), \dots, \pi_n(R \cap \mathbf{d}_1 \times \dots \times \mathbf{d}_n)),$$

where $\pi_i(X)$ is the $i$-th projection of a relation $X$.
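
For finite domains, the filtering function of Example 5 amounts to intersecting the relation with the current Cartesian product and projecting the result on each coordinate. The Python sketch below assumes the relation is given extensionally as a set of tuples; the name `filter_finite` and the sample relation "less than" are illustrative choices, not part of the paper.

```python
from typing import FrozenSet, Set, Tuple


def filter_finite(relation: Set[tuple], domains: Tuple[FrozenSet, ...]) -> Tuple[FrozenSet, ...]:
    """Project R ∩ (d_1 x ... x d_n) on every coordinate."""
    kept = [t for t in relation if all(v in d for v, d in zip(t, domains))]
    return tuple(frozenset(t[i] for t in kept) for i in range(len(domains)))


# R = {(x, y) | x < y} over {1, 2, 3}
less = {(x, y) for x in range(1, 4) for y in range(1, 4) if x < y}

# With x in {2, 3} and y in {1, 2, 3}, only the pair (2, 3) survives.
result = filter_finite(less, (frozenset({2, 3}), frozenset({1, 2, 3})))
```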

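Example 6. Let $\mathcal{R}$ be the set of all real numbers, and $I\mathcal{R}(R_0)$ be the set of all $R_0$-bounded intervals (see example 2) for some finite $R_0 \subset \mathcal{R}$. Consider the relation $\mathrm{add} \subset \mathcal{R}^3$, where $(x, y, z) \in \mathrm{add}$ iff $x + y = z$. The filtering function of add, $\mathcal{F}_{\mathrm{add}}$, is defined according to definition 3 as follows:

$$\begin{aligned}
\mathcal{F}_{\mathrm{add}}(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \big( & [\max\{\underline{x}, (\underline{z} - \overline{y})^-\}, \ \min\{\overline{x}, (\overline{z} - \underline{y})^+\}], \\
& [\max\{\underline{y}, (\underline{z} - \overline{x})^-\}, \ \min\{\overline{y}, (\overline{z} - \underline{x})^+\}], \\
& [\max\{\underline{z}, (\underline{x} + \underline{y})^-\}, \ \min\{\overline{z}, (\overline{x} + \overline{y})^+\}] \big).
\end{aligned}$$

Here

$$x^+ = \min\{y \in R_0 \cup \{-\infty, +\infty\} \mid x \leq y\},$$
$$x^- = \max\{y \in R_0 \cup \{-\infty, +\infty\} \mid x \geq y\}.$$

It is easy to see that there are effective algorithms for the filtering functions of other relations over real numbers (like for "x * y = z", or for "sin(x) = y", etc.) in the interval subdefinite extension $I\mathcal{R}(R_0)$ of $\mathcal{R}$.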
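
The filtering function of add can be sketched directly from the formula of Example 6. In the Python code below, intervals are `(lower, upper)` pairs, `R0` is a sorted list of representable bounds, and `round_up` / `round_down` implement $x^+$ and $x^-$; these names and the sample grid are assumptions made for the illustration. An empty result shows up as an interval whose lower bound exceeds its upper bound.

```python
import bisect
import math
from typing import List, Tuple

Interval = Tuple[float, float]


def round_up(r0: List[float], v: float) -> float:
    """x^+ : the smallest element of R0 ∪ {+inf} that is >= v."""
    i = bisect.bisect_left(r0, v)
    return r0[i] if i < len(r0) else math.inf


def round_down(r0: List[float], v: float) -> float:
    """x^- : the largest element of R0 ∪ {-inf} that is <= v."""
    i = bisect.bisect_right(r0, v)
    return r0[i - 1] if i > 0 else -math.inf


def filter_add(r0: List[float], x: Interval, y: Interval, z: Interval):
    """Narrow x, y, z with respect to x + y = z, rounding outward to R0."""
    (xl, xu), (yl, yu), (zl, zu) = x, y, z
    new_x = (max(xl, round_down(r0, zl - yu)), min(xu, round_up(r0, zu - yl)))
    new_y = (max(yl, round_down(r0, zl - xu)), min(yu, round_up(r0, zu - xl)))
    new_z = (max(zl, round_down(r0, xl + yl)), min(zu, round_up(r0, xu + yu)))
    return new_x, new_y, new_z


R0 = [float(k) for k in range(11)]           # representable bounds 0, 1, ..., 10
result = filter_add(R0, (0.0, 10.0), (3.0, 4.0), (5.0, 6.0))
```

In this run only $\mathbf{x}$ is narrowed (to $[1, 3]$); $\mathbf{y}$ and $\mathbf{z}$ are already consistent with the constraint.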

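Example 7. Consider an arbitrary domain $D$, and its vector-domain $A$,

$$A = \underbrace{D \times \dots \times D}_{n} = D^n$$

for some positive integer $n$, and the domain of all integer numbers $Z$. Let $*D$ be an SD-extension of the domain $D$, $*A$ be the following SD-extension of the compound domain $A$ (the same as in example 4):

$$*A = \underbrace{*D \times \dots \times *D}_{n} = (*D)^n,$$

and $*Z$ be an SD-extension of $Z$. Consider the following relation

$$\mathrm{index} \subseteq A \times Z \times D,$$

where $(a, i, e) \in \mathrm{index}$ iff the $i$-th element of a vector $a$ is $e$, i.e. $a = (e_1, \dots, e_n)$, and $e_i = e$. The filtering function $\mathcal{F}_{\mathrm{index}}$ of the relation index is defined according to definition 3 as follows. For $\mathbf{a} = (\mathbf{a}_1, \dots, \mathbf{a}_n) \in *A$, $\mathbf{i} \in *Z$, and $\mathbf{e} \in *D$,

$$\mathcal{F}_{\mathrm{index}}(\mathbf{a}, \mathbf{i}, \mathbf{e}) = (\mathbf{a}', \mathbf{i}', \mathbf{e}'),$$

where

$$\mathbf{a}' = \begin{cases} (\mathbf{a}_1, \dots, \mathbf{a}_{i-1}, \mathbf{a}_i \cap \mathbf{e}, \mathbf{a}_{i+1}, \dots, \mathbf{a}_n), & \text{if } \mathbf{i} \cap \{1, \dots, n\} = \{i\}, \\ \emptyset, & \text{if } \mathbf{i} \cap \{1, \dots, n\} = \emptyset, \\ \mathbf{a}, & \text{otherwise,} \end{cases}$$

$$\mathbf{i}' = \mathbf{i} \cap \mathrm{app}_{*Z}(\{i \mid \mathbf{a}_i \cap \mathbf{e} \neq \emptyset\}),$$

$$\mathbf{e}' = \mathbf{e} \cap \mathrm{app}_{*D}\Big(\bigcup_{i \in \mathbf{i}} \mathbf{a}_i\Big).$$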

One can easily extend the previous example to the indexation of a matrix. The following example demonstrates this.

Example 8. Given a domain $D$, consider its matrix-domain $M = D^{k \times l}$ for two positive integer numbers $k$ and $l$. Let $Z$ be the domain of all integer numbers, $*D$ be an SD-extension of $D$, $*Z$ be an SD-extension of $Z$, and $*M = (*D)^{k \times l}$ be an SD-extension of the domain $M$. If we regard the matrix-domain $M$ as a vector-domain of dimension $n = kl$, we can define the relation index as in the previous example. Consider another relation $\mathrm{index2} \subseteq M \times Z \times Z \times D$, where $(m, i, j, e) \in \mathrm{index2}$ iff the element at position $(i, j)$ of a matrix $m$ is equal to $e$. Therefore, a constraint index2(m, i, j, e) is equivalent to three constraints

    index(m, p, e),
    add(q, j, p),
    [...]

2.4 Subdefinite Models 

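Definition 4. Let $(V, C)$ be a CSP. A subdefinite model of CSP $(V, C)$ (where $V = \{v_1, \dots, v_n\}$) is defined as follows:

- for the domain $D_v$ of each variable $v \in V$, its SD-extension $*D_v$ is built; denote the compound domain $D_{v_1} \times \dots \times D_{v_n}$ by $D_V$, and its SD-extension $*D_{v_1} \times \dots \times *D_{v_n}$ by $*D_V$;
- for each constraint $c \in C$ (let $c \in C_m$), a filtering function

$$\mathcal{F}_{R_c} : *D_{\mathrm{Arg}_c(1)} \times \dots \times *D_{\mathrm{Arg}_c(m)} \to *D_{\mathrm{Arg}_c(1)} \times \dots \times *D_{\mathrm{Arg}_c(m)}$$

of its relation $R_c$ is constructed.

To simplify the notation, instead of $\mathcal{F}_{R_c}$ we consider the function

$$\mathcal{F}_c^+ : *D_V \to *D_V$$

defined as follows. Let $\mathrm{Arg}_c(j) = v_{i_j}$ for $j = 1, \dots, m$, and $\mathcal{F}_{R_c}(\mathbf{d}_{i_1}, \dots, \mathbf{d}_{i_m}) = (\mathbf{e}_{i_1}, \dots, \mathbf{e}_{i_m})$. Then $\mathcal{F}_c^+(\mathbf{d}_1, \dots, \mathbf{d}_n) = (\mathbf{f}_1, \dots, \mathbf{f}_n)$, where

$$\mathbf{f}_i = \begin{cases} \mathbf{d}_i, & \text{if } i \notin \{i_1, \dots, i_m\}, \\ \mathbf{e}_i, & \text{otherwise.} \end{cases}$$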

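The algorithm of constraint propagation in a subdefinite model of CSP $(V, C)$ (where $V = \{v_1, \dots, v_n\}$) is defined as follows.

Definition 5 (Constraint Propagation Algorithm).

At the $t$-th step, denote by

$\mathbf{d}^{(t)} \in *D_V$ the vector of subdefinite values of variables $V$,

$Q^{(t)} \subseteq C$ the set of active constraints.

Step 0. Let

$$\mathbf{d}^{(0)} = (D_{v_1}, \dots, D_{v_n}), \qquad Q^{(0)} = C.$$

Step $t + 1$. If $Q^{(t)} = \emptyset$, then STOP. Otherwise, choose $c \in Q^{(t)}$ and let

$$\mathbf{d}^{(t+1)} = \mathcal{F}_c^+(\mathbf{d}^{(t)}),$$

$$Q^{(t+1)} = Q^{(t)} \setminus \{c\} \cup \{c' \in C \mid (\exists i)\ \mathrm{Arg}_{c'}(i) = v_j \text{ and } \mathbf{d}_j^{(t)} \neq \mathbf{d}_j^{(t+1)}\}.$$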
The properties of the constraint propagation algorithm are summarized in the following proposition (see [3] for proof).

Proposition 1. In terms of the previous definition, the following assertions are valid:

1. The constraint propagation algorithm in subdefinite models always terminates. The number of its steps is less than $L(*D_V)$, where $L(*D_V)$ is the length of the maximal decreasing (with respect to "$\subseteq$") chain of different elements of $*D_V$.

2. If $a = (a_{v_1}, \dots, a_{v_n})$ is a solution of CSP $(V, C)$, then $a \in \mathbf{d}^*$, where $\mathbf{d}^*$ is the vector of subdefinite values of variables $V$ at the last step of the algorithm.
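
The propagation loop of Definition 5 can be sketched for finite domains as follows. This Python code is a minimal illustration, assuming that every relation is given extensionally as a set of tuples and that SD-extensions are full power sets (as in Example 5); the names `propagate` and `less` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A constraint is a pair (argument variable names, relation as a set of tuples).
Constraint = Tuple[Tuple[str, ...], Set[tuple]]


def propagate(domains: Dict[str, FrozenSet[int]],
              constraints: List[Constraint]) -> Dict[str, FrozenSet[int]]:
    """Apply filtering functions until the set of active constraints is empty."""
    d = dict(domains)
    active = set(range(len(constraints)))          # Q(0) = C
    while active:
        k = active.pop()                           # choose c in Q(t)
        args, relation = constraints[k]
        kept = [t for t in relation
                if all(t[j] in d[v] for j, v in enumerate(args))]
        changed = set()
        for j, v in enumerate(args):
            new = d[v] & frozenset(t[j] for t in kept)
            if new != d[v]:
                d[v] = new
                changed.add(v)
        # Re-activate every constraint that has a changed variable as an argument.
        active |= {i for i, (a, _) in enumerate(constraints) if changed & set(a)}
    return d


values = frozenset({1, 2, 3})
less = {(a, b) for a in values for b in values if a < b}
result = propagate({"x": values, "y": values, "z": values},
                   [(("x", "y"), less), (("y", "z"), less)])
```

Termination follows the argument of Proposition 1: a constraint is re-activated only when some domain has strictly shrunk, and finite domains can shrink only finitely many times.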

3 Graph Problems as CSPs 

For considering various kinds of problems on graphs we will discuss the repre- 
sentation of a graph structure in a CSP. Below we briefly redefine the common 
notions of graph theory. 

Definition 6. A directed weighted graph is a pair $(V, E)$, where $V$ is a finite set of vertices of the graph, and $E \subseteq V \times V \times \mathcal{R}^+$ (where $\mathcal{R}^+$ is the set of all positive real numbers) is a set of edges of the graph. An element $(i, j, w) \in E$ denotes an edge from the vertex $i$ to the vertex $j$ with the weight $w > 0$. We suppose that there exists at most one edge between two vertices, i.e. if $(i, j, w) \in E$, $(i', j', w') \in E$, $i = i'$, and $j = j'$, then $w = w'$. For simplicity, we will suppose that $V$ is a subset of natural numbers: $V = \{1, 2, \dots, m\}$.

An undirected weighted graph will be considered here as a kind of directed weighted graph $G = (V, E)$, where the relation $E$ is irreflexive and symmetric in the first and the second arguments, i.e. if $(i, j, w) \in E$, then $(j, i, w) \in E$, and the edge $(i, i, w)$ does not belong to $E$ for any $i$ and $w$.

The adjacency matrix of the graph $G$ is a real non-negative matrix $M \in \mathcal{R}^{m \times m}$. An element $m_{ij}$ of the matrix $M$ is the weight of the edge from the vertex $i$ to the vertex $j$. If $m_{ij} = 0$ then there is no edge from the vertex $i$ to the vertex $j$ in the graph $G$. Clearly, the adjacency matrix of an undirected graph is a symmetric matrix with zeros on the main diagonal.

A vector of edges of the graph $G$ is a vector of triplets $B \in (V \times V \times \mathcal{R}^+)^n$, where $n$ is the cardinality of the set $E$. An element $b_k = (i_k, j_k, w_k)$ of the vector $B$ denotes an edge of the graph $G$ from the vertex $i_k$ to the vertex $j_k$ with the weight $w_k > 0$.

3.1 Subdefinite Graph

Since we deal with subdefinite values in a CSP, consider the advantages of applying subdefiniteness to the graph representation.

Let a graph $G = (V, E)$ be represented in a CSP by its adjacency matrix $M$. For this purpose, we define an SD-extension of the domain $\mathcal{R}^{m \times m}$ as $(I\mathcal{R}(R_0))^{m \times m}$ (see example 2) for some finite real subset $R_0$. In this case, we can use subdefinite values ($R_0$-bounded intervals) for the representation of weights of edges. This means that we deal with a graph that has subdefinite edges. If the subdefinite weight of an edge contains 0, then this edge may be absent from the graph. Otherwise, the edge exists in the graph, but its weight is subdefinite, i.e. only partially known.

Let a graph $G = (V, E)$ be represented in a CSP by its vector of edges. A subdefinite domain for the representation of this vector is $(*V \times *V \times I\mathcal{R}^+(R_0))^n$, where $I\mathcal{R}^+(R_0)$ is the set of all $R_0$-bounded intervals with lower bounds greater than 0. Each edge in the vector has subdefinite start and finish vertices. This means that we have only partial information about an edge: we do not know precisely the vertices connected by the edge. Moreover, the weights of edges are subdefinite too.

One can, of course, use only defined (precise) values in an adjacency matrix and in a vector of edges, and therefore deal with a fully defined graph. However, the possibility of representing subdefinite graphs allows one to specify and solve a much broader class of graph problems. Below we consider various problems on graphs and their representations as CSPs. We also emphasize problems that have not been discussed previously in graph theory.

3.2 A Path in a Graph 

Definition 7. Given a directed weighted graph $G = (V, E)$ and two of its vertices $i, j \in V$, a path between $i$ and $j$, $p(i, j)$, is a sequence of vertices $i_1, i_2, \dots, i_l$, where $i_1 = i$, $i_l = j$, and for all $k = 1, \dots, l - 1$ there exists an edge in the graph $G$ between the vertices $i_k$ and $i_{k+1}$, i.e. there exists $w > 0$ such that $(i_k, i_{k+1}, w) \in E$. The weight of the path $p(i, j) = i_1, i_2, \dots, i_l$ is the sum of the weights of its edges.

Consider the specification of a CSP for searching a path in a graph. Let a graph $G = (V, E)$ be represented by its adjacency matrix $M \in \mathcal{R}^{m \times m}$. We can represent a path from a vertex $i$ to a vertex $j$ as a vector of vertices $P \in V^m$. The size of the vector is equal to $m$ (the number of vertices of the graph), but only the first $l$ elements are meaningful. The problem of searching a path in a graph is specified with the use of the index and index2 relations discussed in the previous section of the paper. The first element of the vector $P$ is equal to $i$:

    index(P, 1, i).

Each next element of $P$ should be connected with the previous one by an edge. This condition is represented by the following set of constraints. Let

    index(P, k, u)

for some $k$ ($1 \leq k < m$), then

    u = j   or   index(P, k + 1, v),
                 index2(M, u, v, w),
                 w > 0.
The specification of a path in the graph $G$ represented by its vector of edges $B \in (V \times V \times \mathcal{R}^+)^n$ is performed similarly, except that instead of

    index2(M, u, v, w)

we specify the constraint

    index(B, r, (u, v, w))

with an integer variable r.

The constraints above are used to specify a CSP for searching a path in a graph. One can easily add to this CSP other constraints expressing the weight of a path and find the path with minimal weight (see [5] to learn about the constraint propagation algorithm for searching an optimal solution of a CSP). Here we want to emphasize that one can solve the problem of searching an optimal path in a graph with subdefinite (partially known) edges or edges with subdefinite weights. For example, we have specified and solved the Travelling Salesman Problem with subdefinite data using the tools described above.



3.3 A Spanning Tree of a Graph 

Definition 8. Given an undirected weighted graph $G = (V, E)$, its spanning tree is a graph $S(G) = (V, E')$, where $E' \subseteq E$, and $S(G)$ is a tree (a connected acyclic graph).

There is an equivalent definition of a tree in graph theory. In our terms it reads as follows: an undirected weighted graph $G = (V, E)$ is a tree iff it is connected (i.e. there exists a path between each pair of its different vertices), and $|E| = 2(|V| - 1)$. The factor 2 appears because each undirected edge is stored in $E$ twice, once in each direction.

We need two groups of constraints to specify a spanning tree. The first one is the condition $E' \subseteq E$. Let graphs $G = (V, E)$ and $S(G) = (V, E')$ be represented by their adjacency matrices $M$ and $M'$ respectively. Then the condition may be specified as follows:

$$m'_{ij} = 0 \ \text{ or } \ m'_{ij} = m_{ij}$$

for all $i, j = 1, \dots, m$.

The second group of constraints is the condition that $S(G)$ has to be a tree. Since a tree is a connected graph, we specify the existence of a path between each pair of different vertices in $S(G)$. The corresponding group of constraints was discussed in the previous subsection. The only other thing we need to specify is the condition $|E'| = 2(m - 1)$. Clearly, $|E'|$ is equal to the number of positive elements of the matrix $M'$.
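
The two groups of conditions can be checked on fully defined (non-subdefinite) matrices with a short Python sketch. It only verifies a given candidate $M'$ and does not search for one; the function name `is_spanning_tree` and the sample matrices are illustrative assumptions, and $M'$ is assumed to be symmetric, as required for an undirected graph with at least one vertex.

```python
from typing import List

Matrix = List[List[float]]


def is_spanning_tree(m: Matrix, m_tree: Matrix) -> bool:
    """Check E' ⊆ E, |E'| = 2(m - 1) and connectivity of the candidate tree."""
    n = len(m)
    # First group: every entry of M' is either 0 or equal to the entry of M.
    for i in range(n):
        for j in range(n):
            if m_tree[i][j] != 0 and m_tree[i][j] != m[i][j]:
                return False
    # |E'| is the number of positive elements of M'.
    if sum(1 for row in m_tree for w in row if w > 0) != 2 * (n - 1):
        return False
    # Second group: a path exists between each pair of vertices (depth-first search).
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in range(n):
            if m_tree[u][v] > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


M = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]
T = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
ok = is_spanning_tree(M, T)
```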

One of the popular graph problems is building a spanning tree of a given undirected weighted graph with minimal sum of the weights of its edges. We can easily specify and solve such a problem as a CSP using the constraints defined above. Moreover, we can solve this problem with additional constraints on the degrees of vertices of the spanning tree. In our terms, the degree of a vertex $i$ in the graph $G$ represented by its adjacency matrix $M$ is equal to the number of positive elements in the $i$-th row (or, equivalently, in the $i$-th column) of the matrix $M$. Since we can deal with subdefinite values, these degrees can be given as integer intervals. As far as we know, this kind of graph problem has not been discussed before in graph theory.



4 Experimental Results 

Consider a graph G with the following adjacency matrix M : 
We have tried to solve a travelling salesman problem for this graph (searching for a path of minimal weight that contains all the vertices of the graph and whose start and finish vertices are the same). To show the performance, we have considered a set of graphs with sizes less than or equal to 12 (obtained by excluding the last rows and columns from the adjacency matrix). For the same set of graphs we have solved the problem of building a spanning tree.

The table below summarizes the performance characteristics measured on a PC with an AMD K6 200 MHz processor and 64 MBytes of RAM.



Problem                      Size   Time (in sec.)   Backtracks

TSP with adjacency matrix     10         123            16479
                              11         932           108809
                              12        6232           601248
TSP with vector of edges      10         108             2025
                              11         618            11014
                              12        3011            47078
Spanning Tree                  7         161              222
                               8        2538             1882



5 Conclusion 

We have solved all the problems on graphs discussed in this paper using the constraint programming environment NeMo+ [6] developed in our Institutes. The obtained results allow us to hope for a successful application of the proposed techniques to solving a large class of graph problems.

Our future work will aim to extend the class of graph problems and to propose new constraint programming techniques for solving them.

The authors are indebted to Sergei Sannikov and Eugene Rukoleev, who have 
taken part in the coding of some of these problems in NeMo+, and to the referees 
for helpful comments on the paper. 

References 

1. Mayoh, B.: Constraint Programming and Artificial Intelligence. In: Mayoh, B., Tyugu, E., Penjam, J. (eds.): Constraint Programming: Proceedings 1993 NATO ASI Parnu, Estonia. Springer-Verlag, Berlin Heidelberg New York (1993) 18-53

2. Narin’yani, A. S.: Subdefiniteness and Basic Means of Knowledge Representation. Computers and Artificial Intelligence, Bratislava 2, No. 5 (1983) 443-452

3. Ushakov, D.: Some Formal Aspects of Subdefinite Models. Preprint Institute of Informatics Systems, Novosibirsk 49 (1998)

4. Telerman, V., Ushakov, D.: Data Types in Subdefinite Models. In: Calmet, J., Campbell, J. A., Pfalzgraf, J. (eds.): Artificial Intelligence and Symbolic Mathematical Computation: Proceedings. Lecture Notes in Computer Science, Vol. 1138. Springer-Verlag, Berlin Heidelberg New York (1996) 305-319

5. Telerman, V., Ushakov, D.: Constraint Satisfaction Techniques for Mathematical Programming Problems. In: Proceedings of International Conference on Interval Methods and their Application in Global Optimization, INTERVAL’98. Nanjing, China (1998)

6. Telerman, V., Sidorov, V., Ushakov, D.: Problem Solving in the Object-Oriented Technological Environment NeMo+. In: Bjørner, D., Broy, M., Pottosin, I. V. (eds.): Perspectives of System Informatics: Proceedings. Lecture Notes in Computer Science, Vol. 1181. Springer-Verlag, Berlin Heidelberg New York (1996) 91-100




Extensional Set Library for ECLiPSe*



Tatyana Yakhno and Evgueni Petrov 

A. P. Ershov Institute of Informatics Systems, 

6, Acad. Lavrentjev pr., Novosibirsk, 630090, Russia, 
{yakhno,pes}@iis.nsk.su



Abstract. Extensional Set (XS) library is an extension of ECLiPSe which solves set-theoretical constraints over extensional sets containing variables with numeric domains. To efficiently process such a class of set domains, XS library employs a constraint programming method called Subdefinite Computations. Within that framework, a domain representation and an approximate unification algorithm are proposed. The abilities of the library are illustrated by a geometric application.



1 Introduction 

Because people usually express their knowledge implicitly, employing partial information, computers need a special knowledge representation in order to “understand” such partial specifications. A few years ago, it was proposed in the field of Constraint Programming (CP) to simply add a control mechanism to these specifications, provided they are sufficiently formal.

Over the last twenty years, Constraint (Logic) Programming has developed a number of methods and tools for processing numeric data, ranging from arc-consistency for finite domains [10] to box-consistency for interval domains [13]. However, constraint programming systems which process sets are not very numerous [14,9,7,4,5,15]. A related research area is program analysis, which employs sets for automatic inference of various properties of programs [6,2,1]. Finally, in the imperative setting, sets are most significantly supported in the language SETL [12].

With respect to the CP classification, Subdefinite (SD) Computations are a consistency technique [11]. Given a set of constraints, they produce a compact description of a set which contains all the solutions to the constraints. In Section 3, SD computations are described in more detail.

ECLiPSe is a CLP system. It allows users to program constraint satisfaction techniques directly at the language level. Our paper discusses these facilities (Section 2) and a technique for implementing SD computations in ECLiPSe (Section 3). Section 4 describes how this technique is employed in XS library for resolution of constraints over finite extensional sets. Section 7 compares XS library against a powerful library for resolution of set constraints within ECLiPSe. Section 8 describes a geometric application of XS library.

* This project is supported by grant 98-06 from Institut Franco-Russe A. M. Liapunov 
d’informatique et de mathematiques appliquees. 



D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI’99, LNCS 1755, pp. 434-444, 2000.
© Springer-Verlag Berlin Heidelberg 2000
2 ECLiPSe

ECLiPSe is an abbreviation for ECRC Common Logic Programming System. It is a Prolog-based system whose aim is to serve as a platform for creating various extensions of logic programming. ECLiPSe offers two data types, meta-term and delayed goal, which significantly simplify this process. Using meta-attributes and delayed goals, an application can organize additional information and control flows in its own way, independently of Prolog standards.

A meta-term consists of two or more terms: the first term, visible to “everyone”, is called the Prolog value of the meta-term, and the others, called meta-attributes, are visible only to a few tools which convert meta-terms to standard Prolog data and vice versa. A meta-term is written like T{name1:T1, ...} where T is its Prolog value, T1 is its meta-attribute name1, etc.

Formally, a delayed goal is a Prolog goal whose execution has been delayed. A delayed goal represents an action that should be done in the future. There are three major operations with delayed goals: creation, scheduling for execution, and execution of all scheduled goals. A delayed goal is written like 'GOAL'(G) where G is the goal that has been delayed and 'GOAL' is a label indicating that fact.

3 SD Computations 

SD computations were introduced by A. S. Narinyani in the early 1980s and are intensively studied by our colleagues from the A. P. Ershov Institute of Informatics Systems and the Russian Research Institute of Artificial Intelligence.

Let us take some signature without function symbols, with predicate symbols 
{Q, . . .}, variables {x, y, . . .}, constants {a, . . .}, and some interpretation of this 
signature. A symbol and its interpretation are typed identically. 

A constraint is an atomic formula. A constraint satisfaction problem (CSP) 
is a finite set of constraints. A solution to a CSP C is a valuation of the variables 
under which each constraint in C holds. The value a of a variable x is extensible 
to a solution of CSP C, if there is a solution to C which maps x to a. 

Given a CSP $C$, SD computations produce for each variable $x$ a set of values which contains all the values of $x$ extensible to a solution of $C$. Following the traditions of CP, such a set of values is called a domain of $x$. A variable and its domain are denoted by the same lowercase Latin letter.

SD computations pay much attention to domain representation because it is, in fact, a question of efficiency. Simpler domains are less informative, but on the other hand they are processed faster. A domain representation is a function $(\cdot)^*$ which widens an arbitrary domain up to the closest representable one.

A constraint $Q(x, y, \dots)$ defines the following transformations of $x, y, \dots$ ¹:

$$x \leftarrow \mathrm{Pr}_1(Q \cap x \times y \times \dots)^*, \qquad (1)$$

$$y \leftarrow \mathrm{Pr}_2(Q \cap x \times y \times \dots)^*, \qquad (2)$$

which are called calculation functions ($\mathrm{Pr}_i$ is the projection on the $i$-th coordinate, $\times$ is the Cartesian product). The calculation function (1) reads $x, y, \dots$ and writes $x$, the calculation function (2) reads $x, y, \dots$ and writes $y$, etc.

¹ In the Cartesian products, a constant $a$ is replaced with $\{a\}^*$. If $a$ is not finitely representable, then $\{a\}^*$ is larger than $\{a\}$.

Each CSP defines a network of calculation functions which is similar to net- 
works of constraints proposed by other authors [10]. The network contains nodes 
of two types, variables and calculation functions, and naturally splits into star- 
like segments. The center of each star is a calculation function, and its rays reach 
the variables it reads and writes. 

If the domain of a variable $x$ changes, then the calculation functions that read $x$ propagate this change to the neighbours of $x$. Using the data-driven control mechanism, SD computations propagate this wave of domain updates through the network of calculation functions until the wave expires.



Implementation in ECLiPSe. In what follows we briefly describe how the data
types from Section 2 are applied to the implementation of SD computations. Let
$C \ni Q(x, y, \ldots)$ denote the CSP to which SD computations are applied. Each
variable x occurring in C is turned into a meta-term x{sd: var(T_x, Fs)} whose
meta-attribute sd stores the domain of x (the term T_x) and a list of calculation
functions reading x (the list Fs).

Each predicate symbol Q of arity n is associated with an (n + 1)-ary Prolog
predicate compute_q whose intended meaning is

$$\mathtt{compute\_q}(1, T_x, T_y, \ldots) \Leftrightarrow x = \mathrm{Pr}_1(Q \cap x \times y \times \ldots)^*,$$

$$\mathtt{compute\_q}(2, T_x, T_y, \ldots) \Leftrightarrow y = \mathrm{Pr}_2(Q \cap x \times y \times \ldots)^*, \quad \ldots$$

where the terms T_x, T_y, etc. denote the domains of x, y, etc.

Calculation functions of the form (1), (2), etc. are turned into delayed goals
'GOAL'(compute_q(1, x, y, ...)), 'GOAL'(compute_q(2, x, y, ...)), etc. Figure 1
shows an encoding for a calculation function.



q(1, X, Y, ...) :-
    make_suspension(q(1, X, Y, ...), 3, F),       % [1]
    extract(Y, Y_dom, Y_goals),                   % [2]
    assign(Y, Y_dom, [F | Y_goals]),              % [3]
    ...
    extract(X, X_dom, X_goals),                   % [2*n]
    compute_q(1, X_dom, Y_dom, ..., Changed),     % [2*n+1]
    ( var(Changed) -> true ;                      % [2*n+2]
      assign(X, X_dom, []),                       % [2*n+3]
      schedule_woken(X_goals),                    % [2*n+4]
      wake                                        % [2*n+5]
    ).

Fig. 1. A simplified code of a calculation function

4 Brief Introduction to XS Library

Extensional Set (XS) library is an extension of ECLiPSe which solves
set-theoretical constraints over extensional sets containing variables with
numeric domains. The particular constraint solver for numeric data (at present,
Interval Domain library [16]) is a parameter to XS library. The only requirement
of such a solver is that it offer access to the bounds of numeric domains and
creation of numeric domain variables.

Generally speaking, XS library computes sets of ground tuples. A tuple is
a term constructed of numbers and numeric domain variables with the help of
functors (x/n) (n > 1). Each set variable is associated with a set domain. A set
domain consists of two ground sets of tuples $l \subseteq u$. If this domain is
associated with a variable x, then $l \subseteq x \subseteq u$.

Each constraint over sets is enclosed in curly braces and states either equality 
or inclusion (for two sets), or membership (for a tuple and a set). A set inside 
such a constraint is specified either by a set domain variable, or by a list of (not 
necessarily ground) tuples, or by an expression built of such variables and lists. 
Besides that, cardinality of a set can occur in constraints over numeric data. 



Equality. To keep two sets equal, XS library modifies set domains (if at least
one of the sets is specified by a set domain) and numeric domains (if at least one
of the sets is specified by a list of non-ground tuples containing numeric domain
variables).

Equality of a set domain variable and a list of tuples is maintained by two 
delayed goals which transfer information between the set domain and numeric 
domains inside the tuples (if any). Enforcing equality of two set domain vari- 
ables, XS library intersects their domains and unifies the variables themselves; 
no delayed goals are generated. 

The following example shows the effect of stating equality of two sets: 

[eclipse 17]: { [x(0,1),x(1,2),x(2,4)] = [x(0,X0),x(N,XN),x(2,X2)] }.

N = 1
X0 = 1
XN = 2
X2 = 4
yes.

Enforcing an equality constraint between two sets fails if XS library is able to
determine that the sets are different. For example, the following query fails:

[eclipse 18]: A setdom [] .. [x(0,1),x(1,2)], { A = [x(N,M),x(A,B)] }.
no (more) solution.



Inclusion. As with keeping two sets equal, keeping a set A included in a set B
updates the set and numeric domains involved in the specification of A and B.
For example:

[eclipse 21]: A setdom [] .. [x(0,1),x(1,2),x(2,4)],
              { [x(0,1),x(1,X1)] subseteq A }.
A = A{ [x(0,1),x(1,2)] .. [x(0,1),x(1,2),x(2,4)] }

X1 = 2
yes.



Set Expressions. A set can be specified by a set expression, e.g.

[eclipse 22]: { A \/ [0,N] = [0,1,2] }, { A /\ [0,1] = [] }.

N = 1
A = A{[2]}
yes.



Membership. The presence (absence) of a particular element in a set is
stated by a membership constraint. The constraint { X in A } ({ not X in A })
tells XS library "not to let the tuple X out of (into) the set A".

Relating a tuple and a set of tuples by a membership constraint, one is able to
model application of a function to an argument as follows:

app(F, I, FI) :- { x(I, FI) in F }.

The typical usage of (app/3) is illustrated by the following problem taken
from [8]: given integers m < n, find a function f such that f(i) = i - 1 if
i ∈ [m+1, n], and f(i) = f(f(i + 2)) if i ∈ [0, m]. The specification is as follows:

findall(x(I, FI), (between(0, N, I), FI ** 0 .. N), Up),
{ F = Up },
forall(I:0..M, app(F, I) *== app(F, app(F, I+2))),
forall(I:M+1..N, app(F, I) *== I-1).

For example, if M = 6 and N = 9, then the solution is F{[x(0, 6), x(1, 6),
x(2, 6), x(3, 6), x(4, 6), x(5, 6), x(6, 6), x(7, 6), x(8, 7), x(9, 8)]}.
In the general case, XS library spends approximately $O(m^2 + n)$ units of time
on each instance of this problem.
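
The solution quoted above can be cross-checked without any constraint solver. The following Python sketch evaluates the recurrence directly, from i = m down to 0. The function name `solve` is hypothetical, and the sketch assumes 0 <= m <= n - 2 so that i + 2 never leaves the range [0, n]; it does not reflect how XS library itself proceeds.

```python
def solve(m, n):
    """Return f on 0..n with f(i) = i - 1 on [m+1, n] and f(i) = f(f(i + 2)) on [0, m]."""
    if not 0 <= m <= n - 2:
        raise ValueError("this sketch assumes 0 <= m <= n - 2")
    # Base part of the definition: f(i) = i - 1 for i in [m+1, n].
    f = {i: i - 1 for i in range(m + 1, n + 1)}
    # Recursive part, evaluated downwards so that f(i + 2) is already known.
    for i in range(m, -1, -1):
        inner = f[i + 2]
        f[i] = f[inner]  # a KeyError here would mean the recurrence is not yet resolved
    return f


solution = sorted(solve(6, 9).items())
```

For m = 6 and n = 9 the resulting pairs coincide with the tuples x(I, FI) listed in the text.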

5 Representation of Set Domains 

Sets are so-called content-addressable structures. In an imperative environment,
data of that kind are usually represented by hash tables, which make data with
a specified content accessible in nearly constant time. In logic programming,
however, this approach is hard to follow.

XS library transforms lower and upper bounds of set domains to balanced
binary trees of ground tuples. Tuples in such a tree are arranged with respect to
the order $\prec$ defined recursively as follows:

1. $t \prec u$, if t, u are numbers, and t < u,

2. $t \prec u$, if t is a number, u is a tuple,

3. $t \prec u$, if t, u are tuples, and t is shorter than u,

4. $t \prec u$, if $t = x(\ldots t_i \ldots)$, $u = x(\ldots u_i \ldots)$ are of the same length,
   and $t_i = u_i$ for $i \in [1, k-1]$, $t_k \prec u_k$.

The order $\prec$ agrees well with unification in the following sense. Let t be a
non-ground tuple. Let l (respectively u) be the ground tuple obtained from t by
replacing each variable v in t with the lower (respectively upper) bound of the
domain of v. It is easy to see that, if $a \prec l$ or $u \prec a$ for some tuple a, then
unification of a and t will fail.
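
A minimal Python sketch of the order and of the resulting pruning test is given below. It makes several assumptions that are not in the text: tuples are modelled as Python tuples (the functor name is ignored), numbers as Python numbers, and the names `precedes` and `may_unify` are hypothetical.

```python
from numbers import Number


def precedes(t, u):
    """Strict order on ground terms following rules 1-4 above."""
    t_num, u_num = isinstance(t, Number), isinstance(u, Number)
    if t_num and u_num:
        return t < u                      # rule 1: compare numbers
    if t_num:
        return True                       # rule 2: a number precedes any tuple
    if u_num:
        return False
    if len(t) != len(u):
        return len(t) < len(u)            # rule 3: shorter tuples come first
    for a, b in zip(t, u):                # rule 4: first differing component decides
        if a != b:
            return precedes(a, b)
    return False                          # equal tuples: neither precedes the other


def may_unify(a, lower, upper):
    """Necessary condition for a ground tuple a to unify with a non-ground tuple
    whose lower/upper ground instances are given."""
    return not (precedes(a, lower) or precedes(upper, a))


# The tuple x(1, Y) with Y in [3, 7] has lower instance (1, 3) and upper instance (1, 7).
inside = may_unify((1, 5), (1, 3), (1, 7))
below = may_unify((1, 2), (1, 3), (1, 7))
above = may_unify((2, 0), (1, 3), (1, 7))
```

The test is only a filter: a tuple that passes it may still fail to unify, but a tuple that is rejected need not be examined at all.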

Such a representation of set domains is advantageous in two ways. First, because
lower and upper bounds of domains are sorted, all operations on set domains
take a linear amount of time (with respect to the sum of the sizes of the involved
upper bounds). Second, because $\prec$ and unification agree, retrieving from a set
all the instances of a non-ground tuple usually requires scanning only a small
part of the set. For example, if X is a constant, Y is a variable and S is the
rectangle $[0, M] \times [0, 7]$, then XS library enforces the constraint
{ x(X, Y) in S } in 0.03, 0.05, and 0.07 seconds for M = 400, 800, and 1600.

6 Approximate Set Unification 

The Set Unification problem is stated as follows: given two sets of terms (of some
signature), find a substitution which makes the sets identical. The Set Unification
problem has been proved NP-complete. Besides that, even if two sets are
unifiable, their most general unifier is sometimes not "informative" enough, e.g.
the most general unifier of {0, 1} and {x, y} is the identity substitution
{x/x, y/y}, which conveys nothing about how the sets unify. In order to be
efficient, (constraint) logic programming systems usually restrict the class of
processed sets [14,9,7,5].

XS library reduces unification of sets, each specified by a list of tuples, to
unification of set domain variables which are related to these lists by special
predicates (st/2) and (ts/2). The predicates approximate calculation functions
of the following relation $\alpha$ between finite sets and lists:

$$\alpha = \{(s, l) \mid s = \{t_1, \ldots, t_n\},\ l = [t_1, \ldots, t_n],\ t_i \text{ are ground tuples } (i \in [1, n])\}.$$

Though precisely computing the calculation functions of $\alpha$ seems to be
intractable, some larger domains can be computed efficiently. Suppose the lower
and upper bounds of the set domain are l, u, and for each $i \in [1, n]$ the tuple
$t_i$ is ground iff $i \le k$. In the notation below, each non-ground tuple is
treated as the set of its ground instances.

The current version of XS library recomputes the lower and upper bounds l, u
of the set domain as follows (the predicate (ts/2)):

- $l \leftarrow l \cup \{t_1, \ldots, t_k\}$,

- $u \leftarrow u \cap \left(\{t_1, \ldots, t_k\} \cup \bigcup_{i=k+1}^{n} t_i\right)$,

- if $|l| = n$, then $u \leftarrow l$.



That procedure may compute a larger set domain than the corresponding
calculation function would do. For example, if n = 4, $t_1 = 1$, $t_2 = 99$, $t_3 = 99$,
$t_4$ is a variable with the domain [0, 9], l = {9}, u = [0, 9] ∪ {99}, then, after
a call to (ts/2), l will be {1, 9, 99} and u will be unchanged. It is easy to check
that the true calculation function will set l and u to {1, 9, 99}.
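
The three update rules of (ts/2) can be replayed on this example with a short Python sketch. Finite Python sets stand in for set domains and for the sets of ground instances of non-ground tuples; the function name `ts_update` and its argument layout are assumptions made for illustration only.

```python
def ts_update(lower, upper, ground, nonground):
    """One application of the (ts/2) rules.

    ground    -- the ground tuples t_1..t_k
    nonground -- for each non-ground tuple t_{k+1}..t_n, the set of its ground instances
    """
    n = len(ground) + len(nonground)
    new_lower = set(lower) | set(ground)                 # l <- l ∪ {t_1..t_k}
    candidates = set(ground).union(*nonground)           # {t_1..t_k} ∪ ⋃ t_i
    new_upper = set(upper) & candidates                  # u <- u ∩ (...)
    if len(new_lower) == n:                              # if |l| = n then u <- l
        new_upper = set(new_lower)
    return new_lower, new_upper


# The example from the text: t_1 = 1, t_2 = t_3 = 99, t_4 ranges over [0, 9].
low, up = ts_update({9}, set(range(10)) | {99}, [1, 99, 99], [set(range(10))])
```

Here the lower bound grows to three elements while n = 4, so the last rule does not fire and the upper bound stays as it was, which is exactly the loss of precision described above.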

The non-ground tuples $t_i$ ($i \in [k+1, n]$) are iteratively recomputed according
to the following rules (the predicate (st/2)):

- if $a \in l$, $a \in t_{i_a}$ for a unique $i_a \in [k+1, n]$, then $t_{i_a} \leftarrow a$,

- $t_i \leftarrow t_i \cap u$,

- if $|l| = n$, $i \in [1, n]$, $i \neq j$, $t_i$ is ground, then $t_j \leftarrow t_j \setminus \{t_i\}$.

The computations stop when the tuples stop changing. And again, the above 
procedure may compute, for some variable, a larger numeric domain than the 
corresponding calculation function would do. This is a reasonable price for effi- 
ciency of the procedure. 

7 Comparison to the Conjunto Library

Conjunto is a powerful library for the resolution of set-related problems within
ECLiPSe. Conjunto processes finite set domains and constraints stating
inclusion, disjointness, equality for sets, membership for sets and arbitrary
terms, cardinality for sets and integers. The basic algorithm employed by
Conjunto is constraint propagation adjusted for set domains represented by
lower and upper bounds w.r.t. inclusion.

As far as purely set-theoretical problems are concerned, XS library and Conjunto
offer basically the same facilities and performance. However, XS library solves
large instances of some problems several times faster than Conjunto. For example,
computing the set $P_N$ of all prime numbers between 1 and N takes (in seconds,
on a Pentium, 100 MHz, under Linux):



N           500    1000    2000    4000     8000
Conjunto   1.27    3.71   11.42   39.80   152.47
increase           2.92    3.07    3.48     3.83
XS         1.80    4.43   10.53   23.89    55.81
increase           2.46    2.37    2.26     2.33

The constraints specifying $P_N$ are as follows ($\pi_N$ is 96, 169, 304, 551, 1008 for
N = 500, 1000, 2000, 4000, 8000):

$$P_N \subseteq [1, N], \quad \mathrm{card}(P_N) = \pi_N, \quad \bigwedge_{i=2} \ \bigwedge_{j=i}^{N/i} \ i \cdot j \notin P_N.$$
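
The cardinalities quoted above can be checked with a few lines of Python. The sketch assumes that the conjunction excludes every product i·j with 2 <= i <= j and i·j <= N from the set; the function name `p_n` is hypothetical. Under that assumption the largest admissible set consists of 1 and all primes up to N.

```python
def p_n(n):
    """Largest subset of [1, n] that contains no product i * j with 2 <= i <= j."""
    excluded = {i * j for i in range(2, n + 1) for j in range(i, n // i + 1)}
    return set(range(1, n + 1)) - excluded


sizes = [len(p_n(n)) for n in (500, 1000, 2000, 4000, 8000)]
```

The sizes exceed the usual prime counts by one because the number 1 is not excluded by any product.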

XS library works faster, and the time it consumes grows more slowly, because it
sorts the elements of each set domain it processes. The following example with
exclusion shows how remarkably this effort is rewarded. Conjunto spends 0.08,
0.15, 0.30 seconds on exclusion of an element from the set [1, N] for N = 8000,
16000, 32000. XS library spends 0.01 seconds on each exclusion independently
of N.

XS library and Conjunto cooperate with numerical solvers, but XS library
does it more intensely. For example, both libraries allow specifying the
cardinality of a set by a numeric domain variable and then update the set domain
according to updates to the numeric domain.

In general, XS library cooperates with a numerical solver in a more complicated
way than Conjunto. For example, given a set S, the following mixture of
constraints states that A, B, C belong to S and are sorted.

$$A \in S, \quad B \in S, \quad C \in S, \quad A < B, \quad B < C.$$

For each set S made up of three integers, these constraints have a unique solution.

If $S \subseteq \{1, 9, 99\}$ and A, B, C are between 1 and 99, then Conjunto and Finite
Domains library output {} .. {1, 9, 99} for S, 1 .. 97 for A, 2 .. 98 for B, 3 .. 99 for
C. In contrast, XS and Interval Domain library jointly produce S = {1, 9, 99},
A = 1, B = 9, C = 99.

XS library achieves more because it accesses not only set domains. For example,
if the calculation function of $A \in S$ computing $A^*$ is invoked for $A^* = [1, 97]$,
$S^* = [\emptyset, \{1, 9, 99\}]$, then it outputs $A^* = [1, 9]$. Clearly, that is correct,
because 10, 11, and the other larger numbers from $A^*$ can never belong to S. By
analogy, $B^* = [2, 98]$ and $C^* = [3, 99]$ shrink to a singleton {9} and an interval
[9, 99].

Because the domain of B is a singleton {9}, the calculation function of $B \in S$
computing S adds 9 to the lower bound of S, and $S^*$ becomes $[\{9\}, \{1, 9, 99\}]$.
Continuing this process, we arrive at the results output by XS library. Sorting
n integral numbers in the worst case, XS library spends O(n^) units of time.
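
The interplay of the membership and ordering constraints can be replayed with a small Python sketch. It is an illustration under stated assumptions, not the XS implementation: numeric domains are integer intervals `(lo, hi)`, the set domain is a pair of Python sets, and the helper names `narrow_member` and `sort_by_propagation` are hypothetical.

```python
def narrow_member(dom, upper):
    """Shrink interval dom to the hull of the elements of upper lying inside it."""
    lo, hi = dom
    inside = [v for v in upper if lo <= v <= hi]
    if not inside:
        raise ValueError("inconsistent constraints")
    return min(inside), max(inside)


def sort_by_propagation(upper, lo=1, hi=99):
    """Propagate A in S, B in S, C in S, A < B, B < C until nothing changes."""
    doms = {"A": (lo, hi), "B": (lo, hi), "C": (lo, hi)}
    lower = set()                                   # lower bound of the set domain
    changed = True
    while changed:
        before = (dict(doms), set(lower))
        for name in doms:                           # membership constraints
            doms[name] = narrow_member(doms[name], upper)
            if doms[name][0] == doms[name][1]:      # singleton: element surely in S
                lower.add(doms[name][0])
        for small, large in (("A", "B"), ("B", "C")):   # ordering constraints
            (slo, shi), (llo, lhi) = doms[small], doms[large]
            doms[small] = (slo, min(shi, lhi - 1))
            doms[large] = (max(llo, slo + 1), lhi)
        changed = (doms, lower) != before
    return doms, lower


domains, s_lower = sort_by_propagation({1, 9, 99})
```

On the example from the text the loop stops with singleton domains for A, B and C, and with the lower bound of S equal to its upper bound.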

8 Full Minimum Steiner Trees 

We turn to the Minimum Steiner Tree (MST) problem because it consists of
non-trivial numeric and combinatorial parts. The problem is stated as follows.
Given a set R of required vertices, find the shortest tree among trees spanning
$R \cup S$, S being any set of (Steiner) vertices. The sets R, S are subsets of the
Euclidean plane, R is finite. Finding the MST is an NP-complete problem [3].

We focus on finding the MST among trees spanning $R \cup S$, S having
cardinality $|R| - 2$, and call it a full MST. Let R and S be sets of required and
Steiner vertices. The leaves and inner vertices of the full MST form respectively
R and S. Each inner vertex is incident to 3 edges which meet at the angle of
$\pi/3$. Thus a full MST is a binary tree with an extra vertex attached to its root.

Let $R = \{p_1, \ldots, p_k\}$, $S = \{p_{k+1}, \ldots, p_{2k-2}\}$, $p_i = (x_i, y_i)$. The topology
of the full MST is specified by finite sets $\{(i, l_i)\}_{i=k+1}^{2k-2}$ and
$\{(i, r_i)\}_{i=k+1}^{2k-2}$ of arcs mapping each inner vertex to its left and right
children. Because trees are acyclic, the sets $\{l_i\}_{i=k+1}^{2k-2}$ and
$\{r_i\}_{i=k+1}^{2k-2}$ of left and right children are disjoint, each containing
exactly $k - 2$ elements. Note that points in S can be numbered so that, for all
$i$, $l_i < i$, $r_i < i$.

The topology of the full MST meets the following mixture of constraints over
set-theoretical and numeric data.

findall(x(I, _), between(K+1, 2*K-2, I), L),
findall(x(I, _), between(K+1, 2*K-2, I), R),
term_variables(L, Lchilds),
term_variables(R, Rchilds),
forall(I:K+1..2*K-2, app(L, I) *=< I-1),
forall(I:K+1..2*K-2, app(R, I) *=< I-1),
{ Lchilds /\ Rchilds = [] },
# Lchilds *== K-2, # Rchilds *== K-2,

Functions L, R map a vertex to its children. Lists Lchilds, Rchilds specify the
sets of respective children.

For each arc $(i, j)$ let $(\alpha_j, \rho_j)$ be the polar coordinates of $p_j$ w.r.t. $p_i$.
Then, for each inner vertex, the following constraints hold.

forall(I:K+1..2*K-2, (
    app(X, app(L,I)) *== app(X,I)+app(Rh,app(L,I))*cos(app(Al,app(L,I))),
    app(Y, app(L,I)) *== app(Y,I)+app(Rh,app(L,I))*sin(app(Al,app(L,I))),
    app(X, app(R,I)) *== app(X,I)+app(Rh,app(R,I))*cos(app(Al,app(R,I))),
    app(Y, app(R,I)) *== app(Y,I)+app(Rh,app(R,I))*sin(app(Al,app(R,I))),
    app(Al, app(L,I)) *== app(Al,I)+pi/3,
    app(Al, app(R,I)) *== app(Al,I)-pi/3
))

Functions X, Y map a vertex to its coordinates; functions Al, Rh map a vertex
to its polar coordinates w.r.t. its ancestor. Choose $p_{2k-2}$ and $p_1$ to be the root
and the extra vertex attached to it. That gives the last two constraints.

app(X, 2*K-2) *== app(X, 1)+app(Rh, 2*K-2)*cos(app(Al, 2*K-2)),
app(Y, 2*K-2) *== app(Y, 1)+app(Rh, 2*K-2)*sin(app(Al, 2*K-2))
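
The numeric part of these constraints is ordinary polar-to-Cartesian conversion. The following Python sketch evaluates it in the forward direction, from a parent vertex to its children. The names `child_position` and `child_angles` are hypothetical, and the sketch computes with concrete numbers, whereas the constraints above relate numeric domain variables.

```python
import math


def child_position(px, py, rho, alpha):
    """Coordinates of a child at polar coordinates (alpha, rho) w.r.t. its parent."""
    return px + rho * math.cos(alpha), py + rho * math.sin(alpha)


def child_angles(alpha):
    """Directions of the left and right children of a vertex reached at angle alpha."""
    return alpha + math.pi / 3, alpha - math.pi / 3


# A vertex at the origin reached along the x axis, with both children at distance 2.
left_angle, right_angle = child_angles(0.0)
left = child_position(0.0, 0.0, 2.0, left_angle)
right = child_position(0.0, 0.0, 2.0, right_angle)
```

The two children are mirror images of each other with respect to the direction in which the parent was reached.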

The constraints describing the topology and coordinates of Steiner vertices 
define a space of feasible trees which can be explored by some search algorithm in 
order to find the full MST. Figure 2 shows an example of the full MST computed 
by XS library for $|R| = 16$.






Fig. 2. A Full Minimum Steiner Tree 



9 Conclusion 

Our paper is an introduction to the Extensional Set (XS) library for the logic
programming system ECLiPSe. XS solves set-theoretical constraints over
extensional sets containing numeric domain variables within the framework of
Subdefinite Computations by A. Narin'yani.

Because XS library cooperates with an external solver when processing numeric
data in set specifications, it is, to a certain degree, a generic constraint solver.
Such an approach seems more advantageous and reasonable than equipping XS
with its own solver for numeric constraint satisfaction problems.

The intrinsic complexity of Set Unification is the reason that the class of
tractable sets is restricted in many constraint logic programming systems.
Compared with Conjunto, Gödel, and {log}, XS library efficiently manages a richer
class of extensional specifications, allowing for numeric domain variables.
Instead of a "thorough" algorithm of set unification, XS library uses a fast
approximate one.

In the future, XS library will develop toward an efficient low-level
implementation of operations on set domains and the implementation of
higher-level operations oriented to the resolution of problems from
combinatorial geometry.

We are cordially grateful to Natalya Cheremnykh and Olga Drobyshevich, 
Alexander Zamulin, Yury Zagorul’ko from A. P. Ershov Institute of Informatics 
Systems, Tamara Kashevarova, Alexander Narin’yani from Russian Research 
Institute of Artificial Intelligence for invaluable comments and discussion. 

References 

1. Alexander Aiken and Edward L. Wimmers. Solving systems of set constraints. In 
IEEE Symp. on Logic in Comput. Sci., June 1992. 

2. L. Bachmair, H. Ganzinger, and U. Waldmann. Set constraints are the monadic 
class. In Proc. of the LICS’93, 1993. 

3. Nicos Christofides. Graph Theory: An Algorithmic Approach. Management
Science. Academic Press, Imperial College, London, 1975.

4. Agostino Dovier and G. Rossi. Embedding extensional finite sets in CLP. In Proc.
3rd Int. Logic Programming Symp., Vancouver, Canada, 1993.

5. Carmen Gervet. Conjunto: Constraint logic programming with finite set domains.
In M. Bruynooghe, editor, ILPS'94: Proc. 4th Int. Logic Programming Symp., pages
339-358, 1994.

6. N. Heintze and J. Jaffar. A decision procedure for a class of set constraints. In 
IEEE Symp. on Logic in Comput. Sci., July 1991. 

7. P. M. Hill and J. W. Lloyd. The Gödel programming language (CSTR 92-27).
Bristol University, 1992.

8. S. V. Konyagin, G. A. Tonoyan, I. F. Sharygin, I. A. Kopylov, M. B. Cevryuk, 
M. L. Sitnikov, O. A. Baiborodin, V. P. Burichenko, G. V. Golovin, D. O. Orlov, 
L. B. Parnovski, T. A. Sokova, I. V. Stetsenko, V. V. Titenko, and S. A. Filippov. 
International Mathematical Contests. Moskva: Nauka, 1987. 

9. Bruno Legeard and E. Legros. Short overview of the CLPS system. In Proc. 
PLILP’91, Passau, Germany, August 1991. 

10. Alan K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 
8(1):99-118, 1977. 

11. Alexander S. Narinyani. Subdefiniteness and basic means of knowledge represen- 
tation. Computers and Artificial Intelligence, 2(5):443-452, 1983. 




444 Tatyana Yakhno and Evgueni Petrov 



12. J.T. Schwartz, R.B.K. Dewar, E. Dubinsky, and E. Schonberg. Programming with 
Sets. An Introduction to SETL. Texts and Monographs in Computer Science. 
Springer- Verlag, 1986. 

13. Pascal Van Hentenryck, Laurent Michel, and Yves Deville. Numerica: a Modelling 
Language for Global Optimization. The MIT Press, Cambridge, MA, 1997. 

14. Clifford Walinsky. CLP(Σ*): Constraint logic programming with regular sets. In
Giorgio Levi and Maurizio Martelli, editors, Proc. 6th Int. Conf. on Logic
Programming, pages 181-196, Lisbon, Portugal, June 1989. The MIT Press.

15. Tatyana M. Yakhno and Evgueni S. Petrov. LogiCalc: integrating constraint
programming and subdefinite models. In Practical Application of Constraint
Technology, pages 357-372, Westminster Central Hall, London, UK, April 1996.

16. Tatyana M. Yakhno, Vyatcheslav Z. Zilberfaine, and Evgueni S. Petrov.
Applications of ECLiPSe: Interval Domain library. The ICL Systems Journal, pages
35-50, November 1997.




Introducing Mutual Exclusion in Esterel* 



Klaus Schneider and Viktor Sabelfeld 



University of Karlsruhe, Department of Computer Science 
Institute for Computer Design and Fault Tolerance (Prof. Dr.-Ing. D. Schmid) 
P.O. Box 6980, 76128 Karlsruhe, Germany 
{Klaus.Schneider,Viktor.Sabelfeld}@informatik.uni-karlsruhe.de
http://goethe.ira.uka.de/



Abstract. We show how the synchronous programming language Esterel
can be extended by a new statement to implement mutually exclusive
code sections. We also show how the thereby extended Esterel language
can be translated back to standard Esterel and we prove the correctness 
of this transformation. Additionally, we show that the translation fits 
well into different verification approaches. 



1 Introduction 

Synchronous languages like Esterel [1,3] allow describing multithreaded systems
where the threads run in a synchronous manner. The synchronization of threads
comes for free since it is achieved directly by the semantics of the language: Most
of the statements of synchronous languages do not consume time. Instead,
consumption of time must be explicitly enforced by special statements, e.g. the
pause statement of Esterel. As it is only possible to consume a multiple of a
logical unit of time, all threads of a system run synchronously to each other.¹

There exist techniques to translate a multithreaded Esterel program into
a single-threaded program [2] such that it can be translated into standard
sequential programming languages like C. Therefore, Esterel designs can be
conveniently translated to software parts of embedded systems. Moreover, there
are techniques to directly map Esterel designs to register-transfer circuits [2].
It has been shown that the results of this hardware synthesis are almost optimal
[8,9] such that additional optimizations are usually not necessary. For this
reason, Esterel can also be used as a good basis for hardware synthesis.

To summarize, Esterel can be used as a basis for hardware-software codesign,
where Esterel allows describing the system independently of its later realization
in hardware or software. Hence, Esterel is a good language for designing the
digital part of embedded systems. However, from the viewpoint of software
engineers, the communication mechanisms provided by Esterel are rather poor:
the only way for threads to communicate with each other is to broadcast globally
visible signals. Software engineers, in contrast, are often used to implementing
communication via shared memory. Clearly, this presupposes that we have
critical sections of code that are executed in a mutually exclusive manner.

* This work has been financed by the DFG priority program 'Design and Design
Methodology of Embedded Systems'.

¹ Note, however, that the real amount of time between different synchronization
points may differ, i.e. the synchronization points need not be equidistant.

It is not surprising that a lot of different communication principles can be
implemented with the basic broadcasting principle provided by Esterel. In
particular, the communication over shared variables is, of course, possible. The
problem is, however, that mutually exclusive access to these variables must be
guaranteed by the programmer, since there are no semaphore constructs in
Esterel. Nevertheless, these can be implemented in Esterel, but we feel that
especially the imitation of mutual exclusion is an error-prone task.

The mutual exclusion problem was first formulated by Dijkstra [4]: One
considers n (n ≥ 2) processes that communicate with each other through shared
variables. The processes have critical and noncritical code sections. The solution
of the 'mutual exclusion problem' must satisfy the mutual exclusion property:
avoid simultaneous execution of critical sections in two or more processes.
Therefore, whenever one or more processes wish to execute their critical sections,
one of them is selected. While all others are suspended, the chosen one executes
its critical section. In addition, the following fairness condition must be
satisfied: each critical section that can be executed will not be ignored
infinitely many times. We say a solution of the 'critical section problem' is safe
iff it fulfills the mutual exclusion property, and it is called fair iff it fulfills
the fairness property.

In 1968, Dijkstra described [5] a safe and fair solution for two processes. 
Lamport [6] presented in 1974 the correct solution for n processes, called the 
bakery algorithm. This algorithm uses unbounded counters, and can therefore 
not be implemented by a finite state machine. The first finite state solution for 
n processes was described by Peterson [7] in 1983. 

In principle, we could choose Peterson’s algorithm for the solution of our 
problem. We preferred however another solution since this allowed us to sepa- 
rate the mutual exclusion problem from the remaining program statements: To 
implement the mutual exclusion, we introduce an explicit arbitration process 
that schedules the different critical sections that could be executed next. It is 
important that the arbitration is safe, i.e. at each point of time, at most one 
process is granted access to the critical section, and fair. As it is not straight- 
forward to implement such an arbitration process, we have extended the Esterel 
language by a new region statement for establishing critical sections. 

To implement the arbitration process, we use a modification of the DMA
arbitration controller given by Martin [10]. This modification is finite state: For
n processes, we obtain 4n + 2 boolean valued signals, where only 2n of these
signals are state variables. The state number grows proportional to $O(n 2^n)$,
but what is more important: The representation by OBDDs for symbolic model
checking is polynomial [10], so that this implementation lends itself well to
verifying such programs by symbolic model checking. We show in this paper how
programs with the new region statement can be translated to standard Esterel
programs and we prove the correctness of this translation. The translation
involves mainly the parallel execution of a fair arbitration process and
interfacing the critical sections with a simple protocol. It is our aim to develop
a translation that leads to a simple verification afterwards. In terms of model
checking, this means that the arbitration process should have a good BDD
representation. Hence, we avoid the usage of queues or other higher order data
types.

The paper is organized as follows: in the next section, we present the syntax 
and the intuitive semantics of our new statement for establishing critical sec- 
tions. We also present the basics of the translation of these programs back to 
standard Esterel. After that, we prove the correctness of the translation in
two ways: on the one hand, we prove the correctness by means of model
checking techniques. This shows that our arbitration process has a good BDD 
representation such that Esterel programs with critical sections can be directly 
verified by model checking techniques. On the other hand, we prove the correct- 
ness by a paper-and-pencil proof that leads to an interactive proof rule that can 
be used to eliminate region statements for proving a given specification. Finally, 
we discuss a syntactic strategy for avoiding deadlocks in our extended Esterel 
programs. 

2 Extending Esterel by Mutual Exclusion

To express mutual exclusion, we extend the Esterel language by a region
statement designed for declaring critical program sections that can only be
executed mutually exclusively. The syntax of the region statement is as follows,
where ident is a name and statement is an arbitrary (extended) Esterel
statement:



region ident statement end region 

We say that the region statement region A S end region belongs to the region
A and consists of the body S. A program can contain many region statements
belonging to the same region. The meaning of the statement is as follows: If some
region statements region A $S_i$ end region for i = 1, ..., k are to be executed
in parallel, only one body $S_j$ of the region statements is chosen for execution
while the remaining statements have to wait. The body $S_j$ of this selected region
statement is then executed, while all other region statements are suspended until
the execution of $S_j$ terminates. After termination of $S_j$ a new choice among the
remaining region statements is made and so on.² Hence, at each point of time, at
most one body $S_j$, $1 \le j \le k$, of a region statement belonging to the region A
can be active (mutual exclusion). Note that execution of the body $S_j$ of the
selected region statement starts at the same point of time where the region
statement is executed, i.e., entering the region statement and the arbitration do
not consume time.

² To avoid obvious deadlocks, we forbid nested region statements that belong to the
same region.

It is important that the access to the region A is fair, i.e. if a region statement
region A $S_j$ end region is started, then we guarantee that its body $S_j$ will
be executed after some time. In other words, we avoid that one of the region
statements must wait forever and is never allowed to execute its body. Clearly, to
assure this, we must assume that all bodies $S_j$ of the region statements terminate
in each case. For example, suppose we have k stores $store_i$, for i = 1, ..., k, three
modules $Produce_i$, $Consume_i$, and $Duplicate_i$ for all i = 1, ..., k, and wish to
implement mutually exclusive access to the stores $store_i$, i = 1, ..., k. To this
end, we can now use the following region statements with identifiers $A_1, \ldots, A_k$
and run them in parallel:

- region $A_i$ $Produce_i$ end region

- region $A_i$ $Consume_i$ end region

- region $A_i$ $Duplicate_i$ end region

Although Esterel does not directly provide statements for mutually exclusive
execution of threads, the Esterel statements are powerful enough to implement
such a behavior. To see this, we show now how our region statements can be
translated to standard Esterel: Let $R_i$ = region A $S_i$ end region for
i = 1, ..., n be all the region statements belonging to the region A in an extended
Esterel program $S(R_1, \ldots, R_n)$. Then, we replace $S(R_1, \ldots, R_n)$ by the
following statement:

trap trm in
  signal rho_1, ..., rho_n, f_A, alpha_1, ..., alpha_n in
      S(P_1, ..., P_n); exit trm
    ||
      arbitrate_A(rho_1, ..., rho_n, f_A, alpha_1, ..., alpha_n)
  end signal
end trap

where P_i =

  weak abort
    sustain rho_i
  when immediate alpha_i;
  S_i;
  emit f_A



The statement exit trm is used to leave the entire statement in case S
terminates. The statement $P_i$ behaves as follows: Firstly, the wish of 'region A
$S_i$ end region' to access the critical region A is signaled by emitting the request
signal $\rho_i$. The additional Esterel thread
$arbitrate_A(\rho_1, \ldots, \rho_n, f_A, \alpha_1, \ldots, \alpha_n)$ collects all these requests and
decides which one of the region statements is allowed to enter the critical
section. This decision is broadcast via the signal $\alpha_i$, which allows the statement
$R_i$ to enter the critical section. After that, $S_i$ is executed and no further grants
are given by the arbitration thread before $S_i$ terminates. The termination of $S_i$
is signaled in $P_i$ by emitting the release signal $f_A$ of region A, which indicates
that $R_i$ currently leaves the region A. This instructs the arbitration thread to
make new choices and emit new access signals $\alpha_j$.

The arbitration thread can immediately select one of the regions and hence,
the emission of $\rho_i$ can be immediately aborted in $P_i$. The abortion is however
weak, which means that even if $P_i$ is immediately selected, there will be an
emission of $\rho_i$ for at least one point of time. Note further that the request signal
$\rho_i$ is emitted as long as $P_i$ is not allowed to enter its critical section $S_i$.

The above replacement of the region statements by standard Esterel statements
is straightforward and the code size remains more or less the same. However,
the correctness of the translation clearly depends on a sound Esterel
implementation of the arbitration process, which is given in the next section.



3 Esterel Implementation of the Arbitration Process 



In this section, we present a possible implementation of the arbitration process
that can be used for a translation of our extended Esterel language back to
standard Esterel. The basic idea of this arbitration process goes back to a DMA
controller given by Martin [10]. The circuit given by Martin, however, makes
arbitration decisions at any point of time, since it assumes that a single unit
of time is sufficient for accessing the shared resource. This does not hold in
our case, and therefore we need to adapt the arbitration.

Now, what does the arbitration process have to do? It has to choose one of
all requesting threads, i.e., one of the indices $i$ whose request signal
$\varrho_i$ is present at the current instant. The decision is then signaled by
emitting a grant signal $\alpha_i$. After getting access, the region statement
$i$ executes its critical section, and hence the arbitration thread must await
the termination of $S_i$ (signaled by $f_A$). The next arbitration decision can
be made when the release signal $f_A$ is emitted by $P_i$.

The Esterel implementation of the arbitration process for a region $A$ with $n$
region statements is given in Figure 1. There are $n$ inputs
$\varrho_1, \ldots, \varrho_n$ that are emitted by the region statements for
requesting access to the shared resource. The arbitration process emits one of
the $n$ outputs $\alpha_1, \ldots, \alpha_n$ to allow access to one of the
processes.

We will now explain how the arbitration process works without going into
details of the Esterel language. For this reason, we translate the Esterel
program to a finite state machine by means of the Esterel semantics [2]. It is,
however, reasonable to present an intermediate result of the translation and
not the final one. In particular, we consider a combination of interacting
finite state machines that run in parallel, one for each of the threads of the
arbitration process given in Figure 1. These finite state machines are given in
Figure 2. Moreover, we define for $i \in \{1, \ldots, n\}$ the output signals
$\alpha_i$ as
$\alpha_i := arb \wedge \varrho_i \wedge (t_i \wedge p_i \vee static \wedge \bigwedge_{j=1}^{i-1} \neg\varrho_j)$,
where $static := \bigwedge_{j=1}^{n} (t_j \to \neg p_j)$.

Note that this translation is based on the formal semantics of Esterel and is
therefore sound with respect to that semantics. To illustrate the principle of
the translation, we list the translation of a thread that sets the persistence
flag $p_k$.
    module arbitrate$_A$($\varrho_1, \ldots, \varrho_n, f_A, \alpha_1, \ldots, \alpha_n$):
    signal $t_1, \ldots, t_n, p_1, \ldots, p_n, arb$ in
        loop                                          // rotating tokens for daisy chain
          abort sustain $t_1$ when $arb$;
          ...
          abort sustain $t_n$ when $arb$
        end loop
      ||
        loop
          await $\varrho_1 \wedge t_1$;                         // setting persistence
          weak abort sustain $p_1$ when $\neg\varrho_1$         // for process 1
        end loop
      ||
        ...
      ||
        loop
          await $\varrho_n \wedge t_n$;                         // setting persistence
          weak abort sustain $p_n$ when $\neg\varrho_n$         // for process n
        end loop
      ||
        loop
          await immediate $\bigvee_{i=1}^{n} \varrho_i$;        // give acknowledge
          emit $arb$;                                           // when arbitration is
          present $\bigvee_{i=1}^{n} (t_i \wedge p_i)$          // required
          then present
                 case $\varrho_1 \wedge t_1 \wedge p_1$ do emit $\alpha_1$;
                 ...
                 case $\varrho_n \wedge t_n \wedge p_n$ do emit $\alpha_n$;
               end present
          else present
                 case $\varrho_1$ do emit $\alpha_1$;
                 ...
                 case $\varrho_n$ do emit $\alpha_n$;
               end present
          end present;
          await $f_A$
        end loop
    end signal
    end module

Fig. 1. Implementation of the arbitration process in Esterel
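    loop
      await $\varrho_k \wedge t_k$;
      weak abort
        sustain $p_k$
      when $\neg\varrho_k$
    end loop

[Diagram: the thread above, with its control locations marked by hats, shown
next to its finite state machine; the recoverable transition labels are
$\neg(\varrho_k \wedge t_k)$ and $\varrho_k \wedge t_k$.]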



In the thread for setting the persistence flag $p_k$, there are two program
locations where the control flow rests until the next point of time. These
locations are indicated by a hat in the above finite state machine. It is easy
to see that the above finite state machine matches the corresponding one given
in Fig. 2. The others are obtained similarly.




Fig. 2. Transition diagram for the finite state machine of the arbitration process 
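To reason formally about the function of the entire arbitration thread, we now
derive the transition equations of the boolean state variables according to
Fig. 2 (with $k \geq 2$ in the equations for $t_k$):

$$
\begin{aligned}
init(t_1) &:= 1 & next(t_1) &:= (arb \to t_n) \wedge (\neg arb \to t_1) \\
init(t_k) &:= 0 & next(t_k) &:= (arb \to t_{k-1}) \wedge (\neg arb \to t_k) \\
init(p_k) &:= 0 & next(p_k) &:= \varrho_k \wedge (p_k \vee t_k) \\
init(arb) &:= 1 & next(arb) &:= (\neg arb \wedge f_A) \vee \Bigl(arb \wedge \neg \bigvee_{j=1}^{n} \varrho_j\Bigr)
\end{aligned}
$$

$$
static := \bigwedge_{j=1}^{n} (t_j \to \neg p_j)
\qquad
\alpha_k := arb \wedge \varrho_k \wedge \Bigl(t_k \wedge p_k \vee static \wedge \bigwedge_{j=1}^{k-1} \neg\varrho_j\Bigr)
$$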

The state variables $t_k$ describe a ring of $n$ states $t_1, \ldots, t_n$,
where transitions are made from $t_k$ to $t_{(k \bmod n)+1}$ whenever an
arbitration decision can be made. This models a round-robin scheme, i.e., there
is a rotating token associated with the region statements: we say that a region
statement $R_k$ ‘has the token’ whenever we are in state $t_k$. Note that at
each point of time, exactly one of the boolean state variables
$t_1, \ldots, t_n$ holds.

There are two reasons why a statement $R_k$ may be granted access to its
critical section: if $\alpha_k$ holds, then we have
$arb \wedge \varrho_k \wedge t_k \wedge p_k$ or
$arb \wedge \varrho_k \wedge static \wedge \bigwedge_{j=1}^{k-1} \neg\varrho_j$.
The two cases exclude each other. First assume that
$arb \wedge \varrho_k \wedge t_k \wedge p_k$ holds. Then, in particular,
$t_k \wedge p_k$ holds, and hence $static$ cannot hold, so that the second case
cannot hold either. Conversely, assume that
$arb \wedge \varrho_k \wedge static \wedge \bigwedge_{j=1}^{k-1} \neg\varrho_j$
holds. Then $static$ holds, which implies $t_k \to \neg p_k$. Thus
$t_k \wedge p_k$, which would be necessary to satisfy the first case, cannot
hold.

Therefore, there are two different reasons for an arbitration decision. First,
the access may be granted by static priorities: if $static$ holds, then the
requesting region statement $R_i$ with the smallest index $i$ is allowed to
execute its body. Second, if the region statement $R_k$ that currently has the
token ($t_k = 1$) has set its persistence flag ($p_k = 1$), then the static
priorities are ignored and $R_k$ is immediately granted access to the critical
section. The persistence flags $p_k$ are used to establish the fairness of the
arbitration: whenever $R_k$ requests access to the critical section
($\varrho_k = 1$), the request remains until it is satisfied. Hence, there will
be some time at which region statement $R_k$ receives the token, and this event
sets the persistence flag $p_k$. If $R_k$ is not granted access to its critical
section at this point of time, the token will rotate another round. However, if
the request has not been satisfied when the token returns (this implies
$static = 0$, since we then have $t_k \wedge p_k$), then we know that $R_k$ has
been ignored for at least the last $n$ arbitration decisions, and it will
therefore be immediately granted access to the critical section. This ensures
the fairness of the arbiter.

While a region statement executes its body statement, no further arbitration
decisions are to be made. For this reason, we stop the rotation of the token
during this time. This is done by introducing a further boolean state variable
$arb$ that is false iff one of the region statements is currently executing its
body statement. Arbitration decisions are only made when $arb$ holds.
Initially, $arb$ holds since none of the region statements is in the critical
section. Then we wait until one of the processes requests access. If this is
the case, one of the requesting region statements is immediately allowed to
execute its body (cf. specification ‘Immediate Grant’ below). Therefore, $arb$
is unset and remains false until the termination signal $f_A$ is emitted by the
region statement that has been granted access to the critical section.
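The transition equations above can be exercised with a small simulation. The following Python sketch is an illustration only, not the authors' implementation: the class `Arbiter`, the function `simulate`, and the client model (every client requests whenever it is outside its critical section and stays inside for `hold` instants) are assumptions made for the example. Requests, the release signal and the grants are passed as lists of booleans.

```python
class Arbiter:
    """Boolean state machine following the init/next equations of the arbiter."""

    def __init__(self, n):
        self.n = n
        self.t = [k == 0 for k in range(n)]   # init(t_1) = 1, init(t_k) = 0
        self.p = [False] * n                  # init(p_k) = 0
        self.arb = True                       # init(arb) = 1

    def grants(self, rho):
        """Grant signals alpha_k for the current requests rho."""
        static = not any(tk and pk for tk, pk in zip(self.t, self.p))
        return [
            bool(self.arb and rho[k]
                 and ((self.t[k] and self.p[k])
                      or (static and not any(rho[:k]))))
            for k in range(self.n)
        ]

    def step(self, rho, f):
        """Compute the grants of this instant, then move to the next state."""
        alpha = self.grants(rho)
        t, p, arb = self.t, self.p, self.arb
        # The token moves from k-1 to k while arb holds (t[-1] closes the ring).
        self.t = [t[k - 1] if arb else t[k] for k in range(self.n)]
        self.p = [rho[k] and (p[k] or t[k]) for k in range(self.n)]
        self.arb = (not arb and f) or (arb and not any(rho))
        return alpha


def simulate(n, steps, hold=2):
    """Assumed client model: request persistently, hold the region `hold` instants."""
    arbiter = Arbiter(n)
    remaining = [None] * n        # None: requesting; int: instants left in region
    log = []
    for _ in range(steps):
        rho = [r is None for r in remaining]
        f = any(r == 0 for r in remaining)      # release signal of the region
        alpha = arbiter.step(rho, f)
        for k in range(n):
            if remaining[k] == 0:
                remaining[k] = None             # released; request again next
            elif remaining[k] is not None:
                remaining[k] -= 1
            elif alpha[k]:
                remaining[k] = hold - 1         # granted; enter the region
        log.append(alpha)
    return log


if __name__ == "__main__":
    for instant, alpha in enumerate(simulate(3, 12, hold=1)):
        print(instant, alpha)
```

With three clients that all request permanently, the first grant is decided by static priority and the later ones by the token together with the persistence flags, so the grants rotate through the clients.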

4 Verifying the Arbitration Process 

Note that the implementation given in Figure 1 is only a particular choice of an 
arbitration process that can be replaced by any other that satisfies the following 
requirements that we present in temporal logic [11]: 

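Exclusive: At each point of time, at most one $R_k$ may enter the critical
section:

$$\bigwedge_{k=1}^{n} \mathsf{G}\Bigl(\alpha_k \to \bigwedge_{j=1,\, j \neq k}^{n} \neg\alpha_j\Bigr)$$

Only Requested: Only statements that request access are granted entry to the
critical section:

$$\bigwedge_{k=1}^{n} \mathsf{G}\,(\alpha_k \to arb \wedge \varrho_k)$$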

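Immediate Grant: Whenever arbitration decisions can be made, i.e., no process
is currently in its critical section, and there are requests, then there will
immediately be a grant:

$$
\underbrace{\mathsf{G}\Bigl[\bigwedge_{j=1}^{n} \bigl(\varrho_j \to [\varrho_j\ \mathsf{U}\ \alpha_j]\bigr)\Bigr]}_{\text{persistent requests}}
\;\to\;
\underbrace{\mathsf{G}\Bigl[\Bigl(arb \wedge \bigvee_{k=1}^{n} \varrho_k\Bigr) \to \Bigl(\bigvee_{k=1}^{n} \alpha_k\Bigr)\Bigr]}_{\text{immediate decision}}
$$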



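Fairness: The arbitration is fair if we assume that all bodies $S_i$ terminate
and that all requests persist until they are granted (or forever):

$$
\Bigl(
\underbrace{\mathsf{G}\Bigl[\Bigl(\bigvee_{i=1}^{n} \alpha_i\Bigr) \to \mathsf{X}\mathsf{F} f_A\Bigr]}_{\text{termination}}
\wedge
\underbrace{\mathsf{G}\Bigl[\bigwedge_{j=1}^{n} \bigl(\varrho_j \to [\varrho_j\ \mathsf{U}\ \alpha_j]\bigr)\Bigr]}_{\text{persistent requests}}
\Bigr)
\;\to\;
\underbrace{\bigwedge_{j=1}^{n} \neg\mathsf{F}\mathsf{G}\,[\varrho_j \wedge \neg\alpha_j]}_{\text{fairness}}
$$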



The latter condition is very subtle and is therefore explained in more detail.
The first assumption is that each body $S_i$ of each region statement $R_i$ =
region $A$ $S_i$ end region terminates in any case. Clearly, if this did not
hold, we would not be able to guarantee fairness. The second assumption is that
once a region statement requests access to the critical section, it insists on
requesting until it receives a grant to enter the critical section.^3 Note that
the assumption uses a weak until operator, which means that the assumption also
holds if the region statement is never granted access to its critical section
(i.e., $\alpha_i$ remains false from a certain point of time on). The fairness
condition, however, proves that this can never happen.

^3 It is easy to see that the $P_i$’s of Section 2 implement this.

Klaus Schneider and Viktor Sabelfeld

To verify the above specification for an arbitration process for n region state- 
ments, we used the linear time temporal model checker implemented at our in- 
stitute [12,13]. The Exclusive and Only Requested conditions of our specification 
above have been checked within a second even for large n, so that we do not list 
detailed runtimes for them. The experimental results that we obtained for the 
Immediate Grant and Fairness conditions are given in Figure 3 (SUN Sparc 10, 
300 MHz, Solaris 5.7, 640 MByte main memory). 
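For very small $n$, the state-based parts of such a check can be imitated by exhaustive exploration. The sketch below is an assumption-laden illustration and not the model checker used by the authors: it enumerates all reachable states of the arbiter equations of Section 3 composed with a hypothetical environment in which each client is `idle`, `req` or `crit`, a request persists until it is granted, and a client in its critical section may release at any later instant. It checks Exclusive, Only Requested and the conclusion of Immediate Grant; Fairness is a liveness property and is not covered by this reachability check.

```python
from itertools import product


def grants(n, tok, p, arb, rho):
    """Grant signals alpha_k; only the token holder can satisfy t_k and p_k."""
    static = not p[tok]
    return tuple(
        bool(arb and rho[k]
             and ((k == tok and p[k]) or (static and not any(rho[:k]))))
        for k in range(n)
    )


def successors(n, state):
    """Yield (alpha, rho, next_state) for every choice of the environment."""
    tok, p, arb, clients = state
    rho = tuple(c == "req" for c in clients)
    alpha = grants(n, tok, p, arb, rho)
    options = []
    for k, c in enumerate(clients):
        if c == "idle":
            options.append(("idle", "req"))       # may start requesting
        elif c == "req":
            options.append(("crit",) if alpha[k] else ("req",))
        else:
            options.append(("crit", "idle"))      # may release now
    for nxt in product(*options):
        f = any(c == "crit" and d == "idle" for c, d in zip(clients, nxt))
        tok2 = (tok + 1) % n if arb else tok
        p2 = tuple(rho[k] and (p[k] or k == tok) for k in range(n))
        arb2 = (not arb and f) or (arb and not any(rho))
        yield alpha, rho, (tok2, p2, arb2, nxt)


def check(n):
    """Explore all reachable states; return (number of states, violations)."""
    init = (0, (False,) * n, True, ("idle",) * n)
    seen, stack, violations = {init}, [init], []
    while stack:
        state = stack.pop()
        arb, clients = state[2], state[3]
        if sum(c == "crit" for c in clients) > 1:
            violations.append(("two clients in the region", state))
        for alpha, rho, nxt in successors(n, state):
            if sum(alpha) > 1:
                violations.append(("Exclusive", state))
            if any(a and not (arb and r) for a, r in zip(alpha, rho)):
                violations.append(("Only Requested", state))
            if arb and any(rho) and not any(alpha):
                violations.append(("Immediate Grant", state))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen), violations


if __name__ == "__main__":
    for n in (2, 3):
        print(n, check(n))
```

Explicit enumeration like this grows exponentially in $n$, which is one reason a symbolic model checker is the more realistic tool for larger numbers of region statements.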




[Plot omitted; recoverable axis label: Region Statements]

Fig. 3. Runtimes for the verification of the arbitration process



The automatic verification is completely sufficient in this case: we are able
to verify the fairness for more than 40 region statements in less than 20
minutes. The implementation of our arbitration process therefore lends itself
well to model checking techniques, and we believe that Esterel programs with
mutually exclusive sections can also be verified efficiently by model checking.

5 Interactive Proofs

The correctness can also be proven by means of a theorem prover such as the
HOL system [14]. Again, the most complicated condition to prove is the fairness
condition. The proof of fairness runs along the following lines. First of all,
it follows from the termination of the body statements that for any $k$, the
region statement $R_k$ will receive the token infinitely often, i.e., we have
(1) $\mathsf{G}\mathsf{F}\, t_k$ for any $k$. Now assume that there exists some
$k$ such that (2) $\mathsf{F}\mathsf{G}(\varrho_k \wedge \neg\alpha_k)$ holds,
i.e., after some point of time $t_0$, it holds forever that the region
statement $R_k$ requests its critical section but is never allowed to enter it.
As $R_k$ will receive the token infinitely often (as any region statement does
according to (1)), it will also receive the token after $t_0$. Let $t_0 + t_1$
be the first point of time after $t_0$ at which $R_k$ receives the token. Then
the persistence flag $p_k$ of this region statement is set at $t_0 + t_1$. As
$\varrho_k$ holds at all times after $t_0$ by (2), $p_k$ remains true after
$t_0 + t_1$ by the definition of $p_k$. However, by (1), $R_k$ will also
receive the token infinitely often after $t_0 + t_1$, so let $t_0 + t_1 + t_2$
be the first time after $t_0 + t_1$ at which $R_k$ receives the token again. By
the definition of the grant signals, this immediately grants $R_k$ access to
its critical section (as now $arb \wedge \varrho_k \wedge t_k \wedge p_k$
holds). Therefore, we obtain a contradiction, so that (2) must be false and the
arbitration is fair for any number of threads. The other properties are easily
proved by a simple consideration of the implementation of the thread
arbitrate$_A$: the present statements allow only one grant at a time.

As a result, we can now establish a proof rule for the verification of Esterel
programs with region statements. This rule can be used interactively to
transform verification goals about Esterel programs with region statements into
goals that no longer contain these region statements. The rule is obtained from
the correctness of the arbiter by means of the following proof rule:

$$\frac{S \parallel A \models \Phi \qquad A \models \Psi}{S \models \Psi \to \Phi}$$

The above rule eliminates a thread $A$ by replacing it with a property that is
already known to hold for $A$. We use this rule to eliminate the arbitration
thread arbitrate$_A$ after replacing the region statements as outlined in
Section 2.

As a result, we obtain the following proof rule, where
$\varrho_1, \ldots, \varrho_n, f_A, \alpha_1, \ldots, \alpha_n$ are distinct
signals that occur neither in $S(R_1, \ldots, R_n)$ nor in $\Phi$:

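$$
\frac{S(R_1, \ldots, R_n) \models \Phi}
{S(P_1, \ldots, P_n) \models
\left[
\begin{array}{l}
\bigwedge_{k=1}^{n} \mathsf{G}\bigl(\alpha_k \to \bigwedge_{j=1,\, j \neq k}^{n} \neg\alpha_j\bigr) \;\wedge \\
\bigwedge_{k=1}^{n} \mathsf{G}\,(\alpha_k \to \varrho_k) \;\wedge \\
\mathsf{G}\bigl[\bigl(\bigvee_{i=1}^{n} \alpha_i\bigr) \to \mathsf{X}\bigl[\bigl(\bigwedge_{i=1}^{n} \neg\alpha_i\bigr)\ \mathsf{U}\ f_A\bigr]\bigr] \;\wedge \\
\bigl(\mathsf{G}\bigl[\bigl(\bigvee_{i=1}^{n} \alpha_i\bigr) \to \mathsf{X}\mathsf{F} f_A\bigr]\bigr) \to \bigwedge_{j=1}^{n} \neg\mathsf{F}\mathsf{G}\,[\varrho_j \wedge \neg\alpha_j]
\end{array}
\right] \to \Phi}
$$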
The rule is to be read as follows. Suppose that our task is to prove that an
Esterel program $S(R_1, \ldots, R_n)$ containing the region statements
$R_1, \ldots, R_n$, all belonging to the same region $A$, satisfies a property
$\Phi$, which may be given as a first-order temporal logic formula as described
in [15]. Then it is sufficient to prove that the Esterel program
$S(P_1, \ldots, P_n)$ satisfies the property $\Phi$, where we can use the
properties listed above as additional assumptions. The rule simplifies the
proof task, since it encodes the semantics of the region statements by
replacing them with corresponding specifications. Note further that no
arbitration process occurs in the reduced goal, since we already know that it
is correct. In fact, the arbitration process has been replaced with the new
assumptions, which can therefore be viewed as a declarative form of our
arbiter.



6 Avoiding Deadlocks 



We have already proved that the translation of our region constructs to
standard Esterel is safe, i.e., at any point of time any critical section is
executed by at most one thread, and fair, i.e., no thread has to wait
infinitely long for access to the critical section (provided that every thread
leaves the critical section after a finite amount of time).

Of course, this does not mean that there are no deadlocks. When we have only
one region, i.e., when all region statements refer to the same region name, the
program is free of deadlocks: according to the syntactic restriction there are
no nested region statements in this case, and deadlock freedom follows from the
fairness of the arbitration process. If there is more than one region name,
however, deadlocks may obviously occur, as the following simple program
$P_{dead}$ shows:



    region A                 region B
      pause;                   pause;
      region B         ||      region A
      end region               end region
    end region               end region



In a first step, the left-hand thread requests access to the region A, while
the right-hand thread requests access to the region B. If there are no other
requests, the arbitration thread for region A will grant the left-hand thread
access to region A, and analogously, the right-hand thread will receive a grant
to access region B. At the next point of time, however, the left-hand thread
requests region B, while the right-hand thread requests region A. Neither
request can be granted, since neither arbitration thread is in arbitration
mode: the critical sections of both regions A and B are already being accessed.

There are well-known strategies for avoiding deadlocks [16], and we can also
use such strategies in our case. In the remainder of this section, we briefly
describe further syntactic restrictions that ensure the absence of deadlocks.
We emphasize that these restrictions can easily be checked at compile time.
There are programs that do not fulfill these restrictions but are nevertheless
free of deadlocks. However, as deadlock freedom is undecidable in general,
decidable criteria like ours are necessary.

Our restrictions are based on the dependency relation $\prec_P$ on regions in a
given program $P$, which is defined as follows: we say that a region $A$
depends on a region $B$, denoted as $A \prec_P B$, if $P$ contains a region
statement region $B$ $S$ end region such that $S$ contains a region statement
region $A$ ... end region. Hence, some region $A$ statement is nested in a
region $B$ statement. We denote the reflexive-transitive hull of $\prec_P$ by
$\preceq_P$.

As the example $P_{dead}$ above shows, we cannot guarantee deadlock freedom if
the region dependence relation $\prec_P$ is cyclic. We therefore now impose the
syntactic restriction that $\prec_P$ must be acyclic. It is easily seen that
this is equivalent to $\preceq_P$ being antisymmetric, which means that
$\preceq_P$ is a partial order. Hence, we can extend the partial order
$\preceq_P$ to a linear order $<_P$ of all regions occurring in $P$:

$$A_0 <_P \cdots <_P A_m$$

Note that $A_i <_P A_j$ with $i \neq j$ implies either $A_i \preceq_P A_j$ or
that there is no dependency (with respect to $\preceq_P$) between $A_i$ and
$A_j$. In either case, $A_i <_P A_j$ with $i \neq j$ implies
$A_j \not\preceq_P A_i$. We eliminate all region statements region $A_i$ ...
end region successively for $i = m, \ldots, 0$, applying the described
transformation to standard Esterel with a new thread arbitrate$_{A_i}$ for each
region.
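The compile-time check described above amounts to building the dependency relation from the nesting structure and attempting a topological sort. The following Python sketch is only an illustration under stated assumptions: a program is represented, hypothetically, as a list of `(region_name, nested_region_statements)` pairs with all other statements omitted, and the helper names `dependencies` and `linear_order` are invented for this example. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def dependencies(stmts, enclosing=(), deps=None):
    """Map each region A to the regions B whose statements contain a region A."""
    deps = {} if deps is None else deps
    for name, body in stmts:
        deps.setdefault(name, set()).update(enclosing)
        dependencies(body, enclosing + (name,), deps)
    return deps


def linear_order(stmts):
    """Return the regions ordered so that nested regions come first,
    or None if the dependency relation is cyclic."""
    graph = {}
    for inner, outers in dependencies(stmts).items():
        graph.setdefault(inner, set())
        for outer in outers:
            # TopologicalSorter expects node -> predecessors; a nested region
            # is smaller in the linear order, so it precedes the enclosing one.
            graph.setdefault(outer, set()).add(inner)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


# Hypothetical encodings: region B nested in region A (accepted), and the
# two crosswise nested threads of the deadlock example (rejected).
nested_ok = [("A", [("B", [])]), ("B", [])]
p_dead = [("A", [("B", [])]), ("B", [("A", [])])]

if __name__ == "__main__":
    print(linear_order(nested_ok))
    print(linear_order(p_dead))
```

A program for which `linear_order` returns `None` is rejected by the static restriction, even though, as discussed below, it may still be free of deadlocks.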

The restriction that $\prec_P$ must be acyclic implies deadlock freedom. To see
this, note that the acyclicity of $\prec_P$ means that if some region $A$
statement is nested somewhere in the body of a region $B$ statement, then there
must be no region $A$ statement in the program $P$ whose body contains some
region $B$ statement. Hence, at any time of the execution, if a thread $S_i$
has already accessed the regions
$M_i = \{A_{k_{i,1}}, \ldots, A_{k_{i,r_i}}\}$ and requests another region
$A_k$, then it follows that $k < \min\{k_{i,1}, \ldots, k_{i,r_i}\}$. This is a
simple consequence of the construction of $<_P$: the regions that are requested
by one thread are requested in descending order with respect to $<_P$.

Suppose now that the execution of the modified program came to a deadlock. Then
there are threads $S_0, \ldots, S_{\ell-1}$ such that $S_i$ requests some
region $A_{k_i}$ and has already accessed the regions
$M_i = \{A_{k_{i,1}}, \ldots, A_{k_{i,r_i}}\}$. Define
$m_i := \min\{k_{i,1}, \ldots, k_{i,r_i}\}$. By the above explanation, it
follows that $k_i < m_i$ holds. Suppose now without loss of generality that
$A_{k_i}$ belongs to $M_{(i+1) \bmod \ell}$. It follows that
$m_{(i+1) \bmod \ell} \leq k_i$, and therefore $m_{(i+1) \bmod \ell} < m_i$.
But then we have $m_0 > m_1 > \cdots > m_{\ell-1} > m_0$, a contradiction. For
this reason, a deadlock cannot occur.

Certainly, there exist deadlock-free programs with a cyclic region dependence
relation $\prec_P$. For example, take the program $P_{nodead}$ that results
from the program $P_{dead}$ by replacing the parallel statement ‘||’ by the
sequential statement ‘;’. The region dependence relation $\prec_P$ of that
program is cyclic, since $A \prec_P B$ and $B \prec_P A$ hold, but the program
is deadlock free.

This could be handled by introducing the ‘dynamic region dependence relation’
$\prec_P^{dyn}$, which relates accessed regions and currently requested regions
(in the above example, where we considered the threads
$S_0, \ldots, S_{\ell-1}$, we would only have
$A_{k_i} \prec_P^{dyn} A_{k_{i,j}}$). Of course, $\prec_P^{dyn}$ changes during
the execution of the program. A deadlock occurs iff $\prec_P^{dyn}$ becomes
cyclic. Hence, the region management aims at keeping $\prec_P^{dyn}$ acyclic.
Considering again the program $P_{nodead}$, it is easily seen that all dynamic
region dependence relations arising in its execution are acyclic.
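The dynamic relation can be pictured as a graph that is rebuilt from a snapshot of the running threads. The sketch below is an illustration only: the snapshot format (a list of `(held_regions, requested_region)` pairs) and the function names are assumptions made for this example. It relates each requested region to the regions already held by the requesting thread and reports whether the resulting relation is cyclic.

```python
def dynamic_relation(threads):
    """Relate each currently requested region to the regions its thread holds."""
    relation = {}
    for held, requested in threads:
        if requested is not None:
            relation.setdefault(requested, set()).update(held)
    return relation


def has_cycle(relation):
    """Depth-first search for a cycle in a relation given as node -> successors."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GREY
        for succ in relation.get(node, ()):
            state = color.get(succ, WHITE)
            if state == GREY or (state == WHITE and visit(succ)):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(relation))


# Snapshot of the parallel example after the pause: each thread holds one
# region and requests the other one.
parallel_snapshot = [({"A"}, "B"), ({"B"}, "A")]
# Snapshot of the sequential variant: only one thread is active at a time.
sequential_snapshot = [({"A"}, "B")]

if __name__ == "__main__":
    print(has_cycle(dynamic_relation(parallel_snapshot)))
    print(has_cycle(dynamic_relation(sequential_snapshot)))
```

Checking a single snapshot is easy; the difficulty addressed in the following paragraph is deciding the property for all snapshots that can arise during execution.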

However, it cannot be decided whether all dynamic region dependence relations
arising in the execution of an arbitrary program are acyclic: this problem is
undecidable, since the undecidable reachability problem can be reduced to it.
That is why our restriction on ‘static region dependence relations’ can be
considered a reasonable restriction that guarantees deadlock freedom of
extended Esterel programs.

Our syntactic, static check for deadlock freedom is motivated by hierarchical
resource management in operating systems [16]: usually, interrupts coming from
the main memory are served before interrupts coming from the hard disk, so that
an acyclic relation between the resources is established. This is quite similar
to our approach, but at another level of abstraction.

7 Conclusions 

We have shown how the synchronous language Esterel can be extended by a new
statement so that mutually exclusive sections are provided by the syntax. We
have moreover shown how the extended Esterel language can be translated back to
standard Esterel by surrounding the critical code sections with a simple
protocol and adding a separate arbitration thread for each region. We have also
proved the correctness of this arbitration thread, both by model checking the
temporal logic specifications for some numbers of threads and by a
paper-and-pencil proof for arbitrary numbers of threads. In particular, we have
proved that the solution is safe, i.e., at any point of time any critical
section is executed by at most one thread, and fair, i.e., no thread has to
wait infinitely long for access to the critical section (provided that every
thread releases the critical section after a finite amount of time). Moreover,
we have given syntactic restrictions that guarantee freedom from deadlocks.

Acknowledgement. We thank our colleague M. Baldamus for carefully reading
the paper and giving useful comments.
Abstract. Symbolic model checking is a powerful formal-verification
technique which has been used to analyze many hardware systems. In this
paper we present our experiences in applying symbolic model checking
to software specifications of reactive systems. We have conducted two
in-depth case studies: one using the specification of TCAS II (Traffic Alert
and Collision Avoidance System II), and the other using a model of an
aircraft electrical system. Based on these case studies, we have gained
significant experience in how model checking can be used to analyze
software specifications, and have also overcome a number of performance
bottlenecks to make the analysis tractable.

The emphasis of this paper is the uses of model checking in the analysis 
of specifications. We will discuss the types of properties which we were 
able to evaluate in our case studies. These include specific errors we were 
able to identify, as well as general properties we were able to establish for 
the systems. We will also discuss, in more general terms, the potential 
uses of symbolic model checking in the development process of software 
specifications. 

Keywords. Formal methods, formal verification, symbolic model check- 
ing, binary decision diagrams, software specification, finite state repre- 
sentations. 



1 Specification of Reactive Systems 

Reactive systems are central to modern technology. Examples of their deploy- 
ment range from air traffic control systems to advanced medical devices. Since 
they are often deployed in safety critical applications where their malfunctioning 
could cause significant injury or loss of life, their correct implementation is of 
great importance. 

In studying the problem of how to better design these systems, we concentrate
on the specification level. Correct specification is particularly important, since
it is widely recognized that errors introduced early in system design are the
most difficult and expensive to fix. We restrict attention to specifications which
are represented as finite state machines, using languages such as statecharts or
RSML.

The broad goal of the work is to develop techniques that allow us to increase 
our confidence in specification. This includes being able to show that specifica- 
tions obey general design rules, as well as satisfy particular domain dependent 
properties. We are interested in incorporating these techniques into the devel- 
opment process of the specification — using them to debug the specification as 
it is being created, as opposed to just using them in a validation phase to verify 
the specification when it is complete. 

2 Model Checking Technology 

Model checking is a formal verification technique based on state space explo- 
ration. Given a state transition system and a property, model checking algo- 
rithms exhaustively explore the state space to determine whether the system 
satisfies the property. Properties are often expressed in a temporal logic such as 
CTL (Computation Tree Logic) [9]. An important aspect of model checking is 
that when a formula is discovered to be false, a counter example is provided. 
This helps with the understanding of the source of the error, which could be in 
the model, the translation, or even in the formula being evaluated. 
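To make the role of counterexamples concrete, the following Python sketch checks an invariant (a CTL property of the form AG p) by explicit breadth-first exploration and returns a shortest path to a violating state. It is a generic illustration, not the algorithm of any particular model checker; the function name `check_invariant` and the toy transition system are assumptions made for this example.

```python
from collections import deque


def check_invariant(initial, successors, holds):
    """Check that `holds` is true in every reachable state (AG holds).

    Returns None if the invariant holds, otherwise a shortest path from an
    initial state to a violating state, which serves as the counterexample.
    """
    parent = {state: None for state in initial}
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        if not holds(state):
            trace = []
            while state is not None:       # walk back to the initial state
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def toy_step(state):
    """Toy transition system: a counter that advances by two modulo eight."""
    return [(state + 2) % 8]


if __name__ == "__main__":
    print(check_invariant([0], toy_step, lambda s: s != 5))   # odd values unreachable
    print(check_invariant([0], toy_step, lambda s: s != 6))   # violated at state 6
```

The returned trace is what an engineer would inspect to decide whether the fault lies in the model, in the translation, or in the formula.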

A natural concern about model checking is that since the entire state space
must be explored, the run time of algorithms is at least proportional to the
size of the state space, which is potentially enormous. The breakthrough, which
has allowed model checking to be applied to systems with much larger state
spaces, was to use an implicit representation of that space, and to use
symbolic techniques for exploration [4,16]. Instead of visiting states one at a
time, symbolic model checkers visit sets of states in each step. The underlying
representation which is generally used is the Binary Decision Diagram (BDD) [2].
In many practical cases, the size of the BDDs needed to represent the sets of
states used in the model checking algorithm is small. The size of the BDDs used
generally determines the performance of the algorithms. Much of the technical
model checking literature deals with the issue of managing BDD size.
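The difference between visiting single states and visiting sets of states can be sketched as a fixpoint computation over an image operator. In the Python illustration below, `frozenset` merely stands in for a BDD and the transition relation is an explicit set of pairs, so none of the compactness of a real symbolic representation is captured; the names `reachable`, `image` and `RELATION` are assumptions made for this example.

```python
def reachable(init, image):
    """Least fixpoint of the reachable states, one set-valued step at a time."""
    reached = frozenset(init)
    frontier = reached
    steps = 0
    while frontier:
        # One image computation handles every frontier state at once.
        frontier = image(frontier) - reached
        reached |= frontier
        steps += 1
    return reached, steps


# Toy transition relation on 0..7: from s, move to s+2 or s+4 (modulo 8).
RELATION = frozenset(
    (s, (s + d) % 8) for s in range(8) for d in (2, 4)
)


def image(states):
    """Successors of a whole set of states under RELATION."""
    return frozenset(dst for src, dst in RELATION if src in states)


if __name__ == "__main__":
    print(reachable({0}, image))
```

In a BDD-based checker the cost of each step depends on the size of the diagrams representing `frontier` and `reached` rather than on the number of states they contain.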

Model checking was first used in the analysis of hardware designs, and is
now recognized as an important formal tool to use when building hardware
systems. When we started our work on applying model checking to software,
it was an open question whether or not model checking would yield interesting
results on software. There was a belief by some researchers that software
specifications lacked the requisite structure to allow model checking to succeed.
However, there have been a series of case studies by ourselves [5,6] and other
researchers [1,10,14,19] reporting positive results for applying model checking to
software. The impetus for the work was to determine if model checking could be
used to analyze software specifications, but now the issue has shifted to
determining how to get the most leverage from using model checking in the
design process.

3 Application of Model Checking to Software

The question of the feasibility of model checking can be phrased in a number of
ways. We distinguish between these to emphasize that the issue is not “does it
work or not”, but “how can the technology be most effective.”

Modeling the system. The first step in model checking is to translate from 
the specification language (in our case RSML [15] or Statecharts [If]) to the 
representation of the model checker (we used SMV [16]). When this is done, 
the basic model can be constructed, and a reachable state space is computed. 
It is possible that the initial step could fail because of BDD size explosion, 
so a negative result could be reached prior to evaluating any formula. In our 
case studies, we had to do a substantial amount of work to reach the point 
where the initial construction of the model was feasible. 

Evaluation of Properties. The second step in establishing feasibility of model
checking is to show that there are non-trivial properties which can be
evaluated. The standard test (to claim a positive result for model checking) is
to find previously undiscovered bugs in the specification under analysis. Note
that this changes the emphasis to falsification: the desire is to show that the
specification does not work. The absence of falsifying examples is not
verification. We believe that this will be one of the major uses of model
checking: as a debugging tool for identifying errors. This will be an important
tool for improving overall quality by augmenting the ways that errors can be
found. We discuss below various types of properties which can be evaluated.

Range of Properties. The next question is what range of properties can be
evaluated. There are limitations on BDD-based symbolic model checking
which have ramifications on the types of properties which can be checked.
For example, BDDs do a poor job of representing multiplication, which limits
our ability to check properties which involve complicated arithmetic.

Performance. The performance question is often the issue between a check
being feasible and infeasible. For example, our first successful check (of a
trivial property) took 13 hours. This was later reduced to just minutes by
modifying the algorithm. In many situations, the tolerable wait for a result
is probably measured in minutes (because of interactive use, or because of a
group of checks being performed at once). The performance of the algorithm
is directly correlated with the size of the intermediate structures which are
generated.

Ease of use. The long range goal is to develop model checking technology so 
that it can be used by engineers who are not experts in model checking. Our 
work has not reached the stage where this can be assessed. Our success in the 
case studies required modifying the underlying model checking algorithms. 

Development process. Our view is that the critical question is how to use 
model checking while developing specifications. One can imagine a develop- 
ment methodology where a set of invariants are maintained as components of 
a system are designed. Components can initially be modeled at a high level 
of abstraction — either by specifying their desired behavior, or by using 
non-deterministic devices. 
4 Case Studies

We have conducted two major case studies where we applied symbolic model 
checking to software specifications. The first study was our TCAS study [5], 
and the second involved a model of an aircraft electrical system [6]. The second 
study was done in collaboration with engineers from the Boeing Corporation. 
These studies both involved large, real world specifications, written by other 
people. Size was an important issue, since we wanted to validate the technique 
on specifications of commercial scale, as opposed to just on toy problems. 

4.1 TCAS 

TCAS II is an airborne collision avoidance system required by the United States 
Federal Aviation Administration (FAA) on most commercial aircraft that enter 
U.S. airspace. The TCAS-equipped aircraft is surrounded by a protected volume 
of airspace. When another aircraft intrudes into this volume, TCAS II generates 
warnings (traffic advisories) and suggests possible escape maneuvers (resolution 
advisories, or RAs) in the vertical direction to the pilot. 

The system requirements specification of TCAS II, a 400-page document,
was written in RSML. The first obstacle to analysis was its sheer size. As a first
attempt we decided to try to verify a portion of it, namely a state machine called
Own-Aircraft, which occupies about 30% of the specification. Own-Aircraft has
close interactions with another state machine called Other-Aircraft, which tracks
the state of other aircraft in the vicinity and possibly generates RAs. Up to thirty
other aircraft can be tracked. From the RAs given by all the instances of
Other-Aircraft, Own-Aircraft derives a composite RA and generates visual and audio
outputs to the pilot.

We were able to evaluate various properties of the specification, including 
some which revealed errors in the specification^. One example was testing the 
following: 

    AG ((Composite-RA = Climb
         & Composite-RA-Evaluated-Event)
        -> Displayed-Model-Goal >= 1500)

A pilot receives two different outputs from TCAS when being given instruc- 
tions on avoiding another aircraft: an action (Climb or Descend), and a desired 
altitude rate of change. The query is checking that when the pilot is instructed 
to climb, the rate of altitude change is positive. There was a fairly complicated 
counterexample to this, which involved an intruder aircraft changing its climb 
rate in adjacent time intervals. Further discussion of the properties we were able 
to check is given below. 
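
A property of the form AG p states that p holds in every reachable state, and a failed check yields a path to a violating state. The following minimal Python sketch illustrates the idea with an explicit-state breadth-first search; it is not the symbolic (BDD-based) algorithm used in the study. The toy transition system, the goal values, and the names check_invariant, stale_successors and fresh_successors are assumptions made purely for illustration.

```python
from collections import deque

def check_invariant(initial, successors, holds):
    """Check AG holds by breadth-first search over the reachable states.

    Returns None if `holds` is true in every reachable state, otherwise a
    shortest path (list of states) from an initial state to a violation.
    """
    parent = {s: None for s in initial}
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        if not holds(state):
            path = []
            while state is not None:      # walk back to an initial state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

GOALS = (-1500, 0, 1500)   # hypothetical climb-rate goals

def advisory_for(goal):
    return "Climb" if goal >= 1500 else "Descend" if goal <= -1500 else "None"

def stale_successors(state):
    # Flawed toy model: the advisory is derived from the previous goal.
    _, goal = state
    return [(advisory_for(goal), new_goal) for new_goal in GOALS]

def fresh_successors(state):
    # Repaired toy model: advisory and goal are updated together.
    return [(advisory_for(new_goal), new_goal) for new_goal in GOALS]

def climb_is_consistent(state):
    advisory, goal = state
    return advisory != "Climb" or goal >= 1500

counterexample = check_invariant([("None", 0)], stale_successors,
                                 climb_is_consistent)
print(counterexample)
```

In the flawed toy model the search returns a three-state path ending in a Climb advisory paired with a goal below 1500; in the repaired model it returns None.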

We now mention a few of the major steps in the analysis. We made significant
use of non-determinism in our analysis. This means that some of the state
machines were represented as machines which could make arbitrary transitions,
instead of the transitions made in the specification. Using non-deterministic
machines means that the analysis is conservative with respect to safety properties.
Using non-determinism allowed us to apply model checking in an incremental
fashion: we only needed to have portions of the system translated in order to
check properties, and we could refine our translation in response to results of
the model checker. (This was important, since it allowed us to catch errors in
our translation.) There were portions of the system, involving multiplication and
division in the transition relation, which we were not able to model. We replaced
these by non-deterministic operators, which gave a superset of the possible
transitions. Again, this was done so that we could evaluate properties without
having a complete model of the system.

^ We were working with a preliminary version of the specification (Version 6.00,
March 1993). We do not know if the issues are present in later versions of the
specification.

State machines are a natural model for reactive systems which interact with 
the outside world. The inputs to the system are external events. In TCAS, an ex- 
ample of an external event is a transponder signal received from another aircraft. 
State machines also generate internal events which are used to communicate be- 
tween different submachines. There has been much discussion of the semantics 
of these different types of events [13,18,12]. One issue is whether the internal 
events can be active when there are external events received. The TCAS model 
(using RSML) uses the synchrony hypothesis, which is that all internal events 
are processed between external events. One way of viewing this is that internal 
events are infinitely faster than external events. (This is reasonable for systems 
such as TCAS, where the separation of external events is measured in seconds). 
To model synchronization, a state variable stable is introduced to keep track 
of when there are active internal events. The handling of synchronization has a 
major impact on the performance of the model checking algorithms. 
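
The synchrony hypothesis described above can be sketched as a loop that consumes every internal event before the next external event is accepted. This is only an illustrative Python sketch: the function names macrostep and react, the event names, and the event chain are hypothetical and are not taken from the RSML specification.

```python
def macrostep(state, external_event, react):
    """Process one external event under the synchrony hypothesis.

    `react(state, event)` returns the new state and the list of internal
    events generated while handling `event`. The system counts as stable
    again only when no internal events are pending.
    """
    pending = [external_event]
    microsteps = 0
    while pending:                      # not stable: events still active
        event = pending.pop(0)
        state, generated = react(state, event)
        pending.extend(generated)
        microsteps += 1
    return state, microsteps            # stable: ready for the next input

def react(state, event):
    # Hypothetical chain: each event triggers at most one internal event.
    chain = {
        "transponder_reply": ["threat_evaluated"],
        "threat_evaluated": ["ra_computed"],
        "ra_computed": [],
    }
    return state + (event,), chain[event]

final_state, steps = macrostep((), "transponder_reply", react)
print(final_state, steps)
```

Here one macrostep consists of three microsteps; the role of the stable variable in the text corresponds to the condition that the pending list is empty.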

A major difference between the TCAS specification and many hardware
specifications is that some of the transition rules in TCAS depend on arithmetic
operations. Examples include comparing altitudes to determine separation, and
estimating positions based upon velocities and accelerations. Arithmetic
involving addition and comparison can be handled, provided that it is represented
at the bit-wise level, and the bits are interleaved appropriately. However,
multiplication operations are not amenable to BDD representation [3], and this did
limit the portions of the specification that we could analyze. Proper handling
of multiplication is an open problem. In other work, we have attempted to
integrate constraint solving and model checking to handle transitions based on
multiplications [7].
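
The effect of interleaving can be seen on a small example. The Python sketch below counts the internal nodes of the reduced ordered BDD of the comparison a <= b for two 4-bit numbers under two variable orders. It builds the BDD by brute-force enumeration, which is only workable for a handful of variables, and the helper name robdd_size and the variable names are assumptions made for illustration; this is not the transformation applied in the study.

```python
def robdd_size(f, order):
    """Count internal nodes of the reduced ordered BDD of f under `order`.

    f takes a dict mapping variable names to 0/1. Terminals are 0 and 1;
    internal nodes get ids from 2 upwards and are shared via `unique`.
    """
    unique = {}                       # (var, low, high) -> node id

    def build(level, env):
        if level == len(order):
            return 1 if f(env) else 0
        var = order[level]
        low = build(level + 1, {**env, var: 0})
        high = build(level + 1, {**env, var: 1})
        if low == high:               # test on var is redundant
            return low
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
        return unique[key]

    build(0, {})
    return len(unique)

N = 4  # bits per operand

def a_le_b(env):
    a = sum(env[f"a{i}"] << i for i in range(N))
    b = sum(env[f"b{i}"] << i for i in range(N))
    return a <= b

# Most significant bit first in both orders.
interleaved = [v for i in reversed(range(N)) for v in (f"a{i}", f"b{i}")]
separated = ([f"a{i}" for i in reversed(range(N))]
             + [f"b{i}" for i in reversed(range(N))])

size_interleaved = robdd_size(a_le_b, interleaved)
size_separated = robdd_size(a_le_b, separated)
print(size_interleaved, size_separated)
```

With the bits interleaved the BDD only has to remember whether the prefixes seen so far are equal, so it grows linearly with the word length; with all bits of a placed before all bits of b it has to remember the whole value of a.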



4.2 Aircraft Electrical System 

Our second case study was an analysis of a statecharts model of the electrical
power distribution (EPD) system on the Boeing 777 aircraft. We stress that the
statecharts model was developed for research purposes and does not represent
the actual requirements used to develop the on-board system. As such, the model
by intent did not include all the logic necessary for a complete specification. The
model was intended as a high-level abstraction of the electrical system, which
included only the logic necessary to accomplish the goals of a wider airplane
system analysis [17].

The purpose of the EPD system is to distribute AC and DC power to other 
airplane systems. It comprises separate interconnected distribution systems in- 
cluding main AC power, backup AC power, DC power, standby power, and 
flight controls power. Electrical power is distributed from power sources to power 
busses via a number of relayed circuit breakers. Failures of the power sources or 
circuit breakers are automatically detected and isolated. We focus on the portion 
of the statecharts that models the main and backup AC distribution subsystems. 

One of the requirements of the electrical system was that it support a degree
of redundancy: components should remain powered in spite of several failures.
Checks contingent on a number of failures could easily be represented in the
logic, so we were able to evaluate various fault tolerance properties.

Two properties we checked were “Not only should the busses be powered
when there are no failures, they should be powered by different sources” and
“The main busses should in fact tolerate one failure in the power sources or
circuit breakers”, using the formulas

    AG ((Stable & No-Failures)
        -> Separate-Sources)

and

    AG ((Stable & At-Most-1-Failure) -> main)

respectively. Both of these properties failed for essentially the same reason: there
was a subtle modeling flaw in specifying the circuit breaker. The failure of a
circuit breaker and its subsequent recovery were represented as boolean variables,
and not as events, so a transition was not made inside the circuit breaker after
its recovery, and it was left in an incorrect state. The scenarios that trigger the
error were moderately involved. For example, the scenario for the second property
involves a failure in a circuit breaker, a change in inputs to induce a state change
in its controller, the circuit breaker’s recovery, and a subsequent failure in one of
the power sources.

5 Uses of Model Checking 

The prime use of model checking is as a debugging tool. Specific properties are
tested, and when a violation is found, a counterexample is given. In contrast to
verification, model checking is used to find errors, not prove correctness. Model
checking can be used in conjunction with other testing methods (such as
simulation) to gain confidence that errors have been found and eliminated.

A fundamental question in applying model checking is “What to check?”.
Our experience is that the properties of interest divide into two broad classes:
domain dependent, which require understanding of the domain, and domain
independent, which can be considered as “design rules” for specifications.

5.1 Domain Dependent Properties

A key to our success in the two case studies was access to experts on the
systems that we were working with. We would not have been able to identify the
properties to evaluate for the TCAS study without this expertise. Issues such as
checking the consistency of the outputs to the pilot (advisory and climb
rate) would not have occurred to us. The understanding of counterexamples also
required significant domain knowledge. It was necessary to thoroughly
understand the counterexamples in order to determine the type of the error. We do
not believe that it will be possible to reduce the role of the domain expert in the
model checking process.

In our study of the aircraft electrical system, we also worked with domain
experts (the designers of the model). In this study, the properties to test were
more accessible. We had a document which outlined a set of fault tolerance
requirements. These were phrased in terms of probability of failure, but there
was a correspondence between this and bounding the number of simultaneous
failures. Expertise was still necessary in order to clarify several of the properties
that had to be tested. Model checking turned out to be an excellent tool to
use for the evaluation of fault tolerance, since the number of failures could be
included in the precondition of the property being checked.

5.2 Domain Independent Properties 

Domain independent properties can be viewed as design rules that specifications
should satisfy. An example of a property that is quite easy to check is whether the
state transitions are deterministic: is it the case that every state can have at most
one transition enabled at a time? This can be tested by defining a property which
tests for simultaneously active transitions in reachable states. The reason why it
is generally argued that deterministic transitions are important is that if there is
a choice in the behavior, then different implementations may behave differently.
A related property is “function consistency”. If a function is defined in terms of
cases, it is natural to require that the cases are mutually disjoint. Discussions of
other domain independent properties can be found in our papers [5,6].
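
The determinism check can be sketched as a search for a reachable state and an input valuation under which more than one transition is enabled. The Python sketch below is an explicit-state illustration only; the two-state machine, its guards, and the function name find_nondeterminism are hypothetical and do not come from either case study.

```python
from itertools import product

def find_nondeterminism(initial, transitions, input_names):
    """Return (state, inputs, names of enabled transitions) for a reachable
    state with more than one enabled transition, or None if there is none."""
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for values in product([False, True], repeat=len(input_names)):
            inputs = dict(zip(input_names, values))
            enabled = [t for t in transitions
                       if t["source"] == state and t["guard"](inputs)]
            if len(enabled) > 1:
                return state, inputs, [t["name"] for t in enabled]
            for t in enabled:
                if t["target"] not in seen:
                    seen.add(t["target"])
                    frontier.append(t["target"])
    return None

# Hypothetical machine: t2 and t3 overlap when reset holds without intruder.
transitions = [
    {"name": "t1", "source": "Idle", "target": "Alert",
     "guard": lambda i: i["intruder"]},
    {"name": "t2", "source": "Alert", "target": "Idle",
     "guard": lambda i: not i["intruder"]},
    {"name": "t3", "source": "Alert", "target": "Idle",
     "guard": lambda i: i["reset"]},
]
conflict = find_nondeterminism("Idle", transitions, ["intruder", "reset"])

# Making the guard of t3 disjoint from that of t2 removes the conflict.
repaired = transitions[:2] + [
    {"name": "t3", "source": "Alert", "target": "Idle",
     "guard": lambda i: i["reset"] and i["intruder"]},
]
no_conflict = find_nondeterminism("Idle", repaired, ["intruder", "reset"])
print(conflict, no_conflict)
```

The same pairwise-disjointness idea applies to function consistency, with the cases of the function definition in place of the transition guards.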

6 Performance 

Both of our case studies involved large specifications which generated models
which were close to the maximum size which could be evaluated with a model
checker. In the TCAS study, the model had a global state space with 227 Boolean
variables, 10 of which are for events, 36 for the states of Own-Aircraft, 19 for
the states of Other-Aircraft, 134 for altitude and altitude rates, 22 for inputs
other than altitude and altitude rates, and 6 for other purposes. The size of the
state space is about 1.4 x 10®®. The size of the reachable state space is at least 
9.6 X 10®®. In the electrical system study, there are 33 two-state machines, 23 
Boolean inputs, and 34 events, for a total of 90 Boolean state variables, or about
10^27 global states, of which at least 10^® are reachable.




Our general experience is that the performance question is between feasibility 
and infeasibility as opposed to optimizing performance. Most of our successful 
checks ran in under 10 minutes using about 10 megabytes of memory. Unsuc- 
cessful checks were usually terminated after several hours. Failing computations 
generally had excessively large internal (BDD) representations. 

Our initial attempts to check formulas in both the TCAS and the EPD
studies were unsuccessful. In both cases we were forced to make significant changes
to the model checking algorithm, and to our methods of translating from the state
machine model to the representation for the model checker. More detailed
descriptions of our performance enhancements can be found in our papers [5,8,6].
Our methods for addressing the performance problems have included:

Bitwise arithmetic. The order of variables in a BDD can influence its size.
We needed to interleave the variables corresponding to the bits of binary
data. This was done by a transformation which was applied when compiling
to the source language of SMV.

Search order. We found it necessary to modify the search algorithms used by
SMV. One modification involved storing information during a forward search
to make generation of counterexamples more efficient. The choice between
forward search and backwards search was often important.

Short circuiting. This technique reduced the number of BDDs generated by
stopping the iterations before a fixed point was reached.

Making exclusive events explicit. This allowed backwards search to be
performed much more efficiently, reducing the size of the BDDs.

Partitioning strategies. One of the ways to reduce the size of the BDD for
the transition relation is to decompose it into several BDDs with disjunctive
or conjunctive partitioning [4].

Abstraction. One abstraction technique that we applied was to identify
portions of the system that were not relevant to a check (with a conservative
analysis), and remove that part of the system to reduce the size of the model.
One of the keys to making this work well is to be able to identify false
dependencies.

Synchronization. Our representation of state machines distinguished between
macrosteps (for outside events) and microsteps (for internal events). Inside a
macrostep, all internal events would be executed, so the next macrostep could
not start until no more internal events could be generated. We discovered
that performance could be greatly improved if we made the synchronization
process as regular as possible, even at the expense of increasing the number
of states, or the lengths of event chains.
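
As an illustration of backwards search combined with short circuiting, the following Python sketch grows the set of states that can reach a bad state and stops as soon as that set meets an initial state, rather than iterating to the fixed point. Plain Python sets stand in for BDDs here, and the function name violates_safety and the toy state graph are assumptions made for the example, not the modified SMV algorithms.

```python
def violates_safety(initial, predecessors, bad):
    """Backward search from `bad` with short circuiting.

    Returns (violated, iterations). The search stops as soon as the set of
    states that can reach `bad` contains an initial state.
    """
    can_reach_bad = set(bad)
    frontier = set(bad)
    iterations = 0
    while frontier:
        if can_reach_bad & initial:
            return True, iterations          # stop before the fixed point
        frontier = ({p for s in frontier for p in predecessors(s)}
                    - can_reach_bad)
        can_reach_bad |= frontier
        iterations += 1
    return bool(can_reach_bad & initial), iterations

def predecessors(s):
    # Hypothetical graph: a chain 0 -> 1 -> ... -> 9 reachable from state 0,
    # plus a long chain 100 -> 101 -> ... -> 199 -> 5 that is not reachable.
    preds = set()
    if 1 <= s <= 9 or 101 <= s <= 199:
        preds.add(s - 1)
    if s == 5:
        preds.add(199)
    return preds

result = violates_safety({0}, predecessors, {5})
print(result)
```

In this toy graph the violation is detected after five backward steps, whereas computing the full fixed point would also traverse the hundred-state chain that plays no role in the answer.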

7 Conclusions 

The goal of our work has been to show that symbolic model checking can be used
in the analysis of software specifications. We have conducted case studies on real
specifications, and have had success in identifying errors in the specifications
that were not previously known. We have also developed techniques to improve
the performance of the model checking algorithms, and allow checks to be made
which were previously intractable. We are optimistic about the future of model
checking in the software development process. There is still much work to do in
refining the algorithms and developing tool support for software model checking,
but there is a growing body of evidence that model checking is applicable in the
software domain as well as in the hardware domain.

Abstract. This paper reports on a non-trivial case study carried out in
the context of the German correct compiler construction project Verifix.
The PVS system is here used as a vehicle to formally represent and
verify a generic checker routine (run-time result verification) used in
compiler back-ends. The checker verifies the results of a sophisticated
labeling process of intermediate language expression trees with instances
of compilation rule schemata. Starting from an operational specification
(i.e. a set of recursive PVS functions), necessary declarative properties
of the checker are formally stated and proved correct.

Keywords: formal verification, checker-based program verification, generic 
specification 



1 Introduction 

The German project Verifix on compiler verification aims at developing inno- 
vative methods for the construction of correct realistic compilers for practically 
relevant source languages and concrete target architectures. Correct execution 
of source programs depends on the correctness of the binary machine code exe- 
cutable, thus either the final executable has to be verified or the compiler used 
is to be shown correct [3]. 

A realistic state-of-the-art compiler is a large and complex program system 
consisting of many hard, highly optimizing algorithms which are difficult to ver- 
ify since mathematical inductive arguments often fail. For example, the code 
generation phase of a compiler often uses clever routines for register allocation, 
instruction scheduling or pipeline optimizations. For this reason, a more prac- 
tical modular approach is taken: we use a checker-based approach to program 
verification, which works if partial correctness suffices (i.e. rather no result than 
a wrong result). It is often much easier to check the correctness of a given result 
at run time than to verify the generating algorithm and its implementation. In 
our case, for instance, we would rather check that every assigned register is free
and available than totally verify the sophisticated register allocation algorithm. 
Thus, one can concentrate on the verification of (in general) small checking (fil- 
ter) routines built into the code in order to establish partial correctness of the 
entire program. Of course, this only makes sense if the verification of the checker 
is indeed easier than the verification of the program whose results are checked 
[4,6]. Checkers have been used to ensure type correctness properties of a C sub- 
set compiler [7] and to verify the compilation of synchronous languages to C [9], 
but not yet to “verify” totally a machine code generation procedure. 
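
The checker-based approach can be sketched as a small verified filter wrapped around an unverified component. The Python sketch below is a hedged illustration for the register allocation example: the data layout (variable, register, first use, last use), the names check_allocation and checked_allocate, and the two stand-in allocators are assumptions, not part of the Verifix implementation.

```python
class CheckFailed(Exception):
    """Raised when the unverified component returns an unacceptable result."""

def check_allocation(allocation, available):
    """Accept the allocation only if every register is available and free.

    `allocation` is a list of (variable, register, first_use, last_use).
    Two variables may share a register only if their live ranges are disjoint.
    """
    accepted = []
    for var, reg, start, end in allocation:
        if reg not in available:
            raise CheckFailed(f"{var}: register {reg} is not available")
        for other, other_reg, o_start, o_end in accepted:
            if other_reg == reg and start <= o_end and o_start <= end:
                raise CheckFailed(f"{var}: register {reg} still holds {other}")
        accepted.append((var, reg, start, end))
    return allocation

def checked_allocate(untrusted_allocator, program, available):
    # Partial correctness: either a checked result or no result at all.
    return check_allocation(untrusted_allocator(program), available)

# Hypothetical stand-ins for a sophisticated, unverified allocator.
def good_allocator(program):
    return [("x", 3, 0, 4), ("y", 4, 2, 6), ("z", 3, 5, 8)]

def buggy_allocator(program):
    return [("x", 3, 0, 4), ("y", 3, 2, 6)]

result = checked_allocate(good_allocator, None, {3, 4, 5})
try:
    checked_allocate(buggy_allocator, None, {3, 4, 5})
    rejected = False
except CheckFailed:
    rejected = True
print(result, rejected)
```

Only check_allocation would need to be verified in such a design; the allocator itself can remain arbitrarily complex.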

In this paper the PVS specification and verification system is utilized to 
formally verify the specification of such a checker program to be used in the 
back-end part of a compiler. The back-end translates linear intermediate code 
(i.e. sequence of assignments of expressions) into linear assembly code. This 
back-end is to be generated from a set of local translation rule schemata and 
additional components such as optimized register allocators and schedulers. The 
rule schemata were independently verified with respect to source and target 
language semantics [1]. 

The part of the compiler we are to check gets as input an intermediate lan- 
guage expression tree and outputs a labeled expression tree. The labels consist 
of the rule used to compute the node, assignments of the register and numeri- 
cal variables to actual registers and values respectively, as well as the schedule 
number of the rule. 

Our formalization is generic with respect to the languages and translation
rules. It is realized as a parameterized PVS theory. The specification, being
written in an operational style, is executable within the prover. It has been applied
to a small realistic example of translation from the intermediate language MIS
to DEC Alpha assembly code.

We present these results as follows: the next section gives a brief introduction 
to PVS. Sect. 3 outlines the principle of generator-based back-end generation. 
In Sect. 4 the PVS formalization of the checker is presented and declarative 
correctness properties are stated, formalized and proved correct. All PVS theories 
and proof scripts are available from the authors upon request. 

2 A Brief Introduction to PVS 

The PVS system [8] combines an expressive specification language with an
interactive prover/proof checker. The PVS specification language builds on classical
typed higher-order logic with the usual base types, bool, nat, among others, the
product type constructor [A,B] and the function type constructor [A->B]. The
type system of PVS is augmented with dependent types and abstract data types.
The special type TYPE designates an unspecified type, and TYPE+ an unspecified
non-empty type. A distinctive feature of the PVS specification language is
predicate subtypes: the subtype {x: A | P(x)} consists of exactly those elements
of type A satisfying predicate P. Predicate subtypes are used, for instance, for
explicitly constraining the domains and ranges of operations in a specification
and to define partial functions. Sets are identified with their characteristic
predicates, and thus the expressions pred[A] and set[A] are interchangeable. For
a predicate P of type pred[A], the notation (P) is just an abbreviation for the
predicate subtype {x: A | P(x)}.

In general, type-checking with predicate subtypes is undecidable; the type- 
checker generates proof obligations, so-called type correctness conditions (TCCs) 
in cases where type conflicts cannot immediately be resolved. A PVS expression 
is not considered to be fully type-checked unless all generated TCCs have been 
proved. PVS only allows total functions, hence it must be ensured that all (recur- 
sive) functions terminate. For this purpose, a well-founded ordering or a measure 
function is used. The definition of a recursive function f generates a TCC which 
states that the measure function applied to the recursive arguments decreases 
with respect to a well-founded ordering. A built-in prelude and loadable libraries
provide standard specifications and proved facts for a large number of theories
(we use for instance the finite_set type, the upto and subrange subtypes of
nat, the empty? predicate over sets, the choose function to extract an element
from a set, etc.). Specifications are realized as possibly parameterized PVS
theories and theory parameters can be constrained by means of assumptions.
When instantiating a parameterized theory, TCCs are automatically generated
according to the assumptions.

Proofs in PVS are presented in a sequent calculus. There exists a large
number of atomic commands (for quantifier instantiation, automatic conditional
rewriting, induction, etc.) and built-in strategies generating proofs for the
easiest subgoals automatically.

3 Back-End Generation by Term Rewriting 

The back-end of a compiler is the part of the program in charge of the final
translation from a low-level intermediate language to assembly or machine code (this
phase is usually called code generation). Its main task is to generate sequences
of target level instructions to compute the value of intermediate language
expressions. State-of-the-art code generators are themselves generated from a
set of optimized translation rule schemata and include complex mechanisms for
optimal rule selection, register allocation and operation scheduling.

The rule schemata are local translation rules associating a sequence of
assembly code to an expression subtree, the latter being arbitrarily complex depending
on the level of resource and time optimization. They are parameterized by use of
variables in place of registers and constants, and the set of registers or register
variables used in input, output and temporary storage (in the generated code)
are given. These rules are mechanically proved correct with respect to the
semantics of the intermediate and target languages independently from the whole
process in PVS using a user-defined strategy [1].

As already stated, we want to avoid the verification of the specification, let
alone the implementation, of the rule selector/allocator/scheduler taking care of
the labeling of the expression trees. This is possible by verifying the output of the
procedure at run time, aborting the compilation if ever an error occurs (giving
the available elements for the correction of the bug). The checking procedure
must however be proven to detect any case where the code that will further be
generated from the labeled tree will not exactly implement the computation of
the translated expression.

Figure 1 gives an overview of the compilation process. As illustrated, the 
back-end generator must be partly verified to make sure that the verified code it 
uses is not altered in any way, and that the components on the correctness critical 
path are correctly connected. The generated back-end contains non verified code 
whose results will be checked at run time by the verified checker. 




Fig. 1. Overview of the compilation with detailed back-end principle 



The straightforward way to make sure that the labeling was correctly done is
to extract the code of the labeled tree according to the schedule, and show that
this code implements the computation of the initial expression. This is clearly
impracticable at run time, as we would have to deal with the semantics of the
languages. But the rule schemata were already proved correct, and thus the
translation will be correct if the rules are “properly” used. The proper use of a
rule being hard to define formally, we will verify properties that are intuitively
needed and give elements to show that these properties actually imply a correct
resulting code given a correct implementation of the code extractor.

The labeling process actually represents a covering of the expression tree with
instances of the expression trees of the translation rules used, each of the rule
trees being rooted at the node for which the rule is applied (as the expression
part of the rules may be a single unary operator as well as a complex expression).

The correctness requirements of this process are to make sure the covering
is correct (every expression node is covered with a rule node with a correct
operator), to verify the schedule (subterms must be computed before their use),
to verify the value passing from children to parent rule (the output register of the
child rule is the same as the corresponding input register of the parent rule, via
assignment of register variables) and to verify that the values computed are not
overwritten before their use.

Let us continue with an example presented in [10]. The source language
statement V := V + 1 will be compiled to the following MIS expression (the storage
address of V being 8 relative to the local pointer, which is stored in register
1 on the DEC-Alpha):

intassign(local(intconst(8)), intadd(content(local(intconst(8))), intconst(1)))

and this expression will be compiled using the following rule schemata (note
the encoding size of constant operators):

rule1 : intassign(local(intconst16(i)), reg(X))   •; STQ(X,i,1)
rule2 : intadd(reg(X), intconst16(i))             Y; ADDI(X,i,Y,Q)
rule3 : content(local(intconst16(i)))             Y; LDQ(1,i,Y)

to the following DEC-Alpha code: 

LDQ(1, 8, 3); ADDI(3, 1, 3, Q); STQ(3, 8, 1)
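
The way a schedule and a set of variable assignments turn instantiated rule schemata into a code sequence can be sketched as follows. This Python fragment is illustrative only: the instruction templates follow one reading of the three rule schemata above, and the schedule numbers, the choice of register 3 for the intermediate value, and the names RULES, labels and extract_code are assumptions.

```python
# Instruction templates, one per rule schema (assumed reading of the rules).
RULES = {
    "rule1": "STQ({X}, {i}, 1)",
    "rule2": "ADDI({X}, {i}, {Y}, Q)",
    "rule3": "LDQ(1, {i}, {Y})",
}

# One label per rule instance: rule name, schedule number, and the assignment
# of register and numerical variables to actual registers and values.
labels = [
    {"rule": "rule1", "schedule": 3, "bind": {"X": 3, "i": 8}},
    {"rule": "rule2", "schedule": 2, "bind": {"X": 3, "i": 1, "Y": 3}},
    {"rule": "rule3", "schedule": 1, "bind": {"i": 8, "Y": 3}},
]

def extract_code(labels):
    """Emit the instantiated instructions in schedule order."""
    ordered = sorted(labels, key=lambda label: label["schedule"])
    return "; ".join(RULES[l["rule"]].format(**l["bind"]) for l in ordered)

code = extract_code(labels)
print(code)
```

The value passing is visible in the bindings: the output variable Y of rule3 and the input variable X of rule2 are both bound to register 3, as are the output of rule2 and the input of rule1.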

Figure 2 sketches the problem of verifying value passing between the codes
generated for a sub-expression and the top operator. As one would expect, the
three expression trees and the two assignments involved make this verification
somewhat complex. This led us to define an operational formulation of this
verification process, and similar specifications for the other properties.

Fig. 2. Verification of value passing





Formal Verification of a Compiler Back-End Generic Checker Program

These specifications are hardly usable as is in a proof context, thus more 
declarative properties have been stated and proved to hold for any labeled tree 
successfully checked. These properties shall be the basis for all future proofs. 

4 Formalization of the Checker 

We tried to keep the specification as generic as possible by abstracting over
the syntax and structures. We did, however, have to select a proper structure for
the expression trees in order to be able to use induction in the proofs. We defined
the abstract datatype Tree (nodes have a value, a left and a right son, leaves
are terminal) parameterized by the type of the values of the nodes, and PVS
generated the induction theorems for this structure. We used this type for both
the labeled trees and rule trees encoding. The parameterization of our PVS
theory is presented in Fig. 3.

The four verifications described in the previous section are undertaken by
four independent predicates (functions of type [Node->bool]): covercheck?,
schedulecheck?, valuecheck? and overwritecheck?. Some of these predicates
have a straightforward formulation with mutually recursive functions.
Unfortunately, the PVS system does not support mutually recursive functions due to
termination problems. We therefore wrote them using a single recursive function
with a flag indicating which of the bodies is to be evaluated. Termination of
these functions is ensured by a measure function using a lexicographical
ordering of the flag and the measure functions of the bodies. Figure 4 presents the
PVS code for the valuecheck? predicate.
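
The flag technique itself can be illustrated outside PVS. The Python sketch below merges two mutually recursive tree checks into one recursive function with a flag; it is a simplified stand-in and not a transcription of valuecheck?. The Tree class, the table RULE_ROOT_OP, and the simplified property being checked (every rule label matches the operator of its node) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Tree:
    op: str
    rule: Optional[str] = None            # rule rooted at this node, if any
    children: List["Tree"] = field(default_factory=list)

# Hypothetical table: operator at the root of each rule's expression tree.
RULE_ROOT_OP = {"rule1": "intassign", "rule2": "intadd", "rule3": "content"}

ROOT, INSIDE = 0, 1                        # which of the two bodies to run

def check(flag, t):
    """One recursive function standing in for two mutually recursive ones.

    Measure: (size of t, flag) ordered lexicographically. Each recursive
    call either moves to a strictly smaller subtree, or keeps the same
    tree and lowers the flag from INSIDE to ROOT.
    """
    if flag == ROOT:
        return (t.rule is not None
                and RULE_ROOT_OP.get(t.rule) == t.op
                and all(check(INSIDE, c) for c in t.children))
    if t.rule is not None:                 # a new rule starts at this node
        return check(ROOT, t)
    return all(check(INSIDE, c) for c in t.children)

def example(load_rule):
    address = lambda: Tree("local", None, [Tree("intconst")])
    load = Tree("content", load_rule, [address()])
    add = Tree("intadd", "rule2", [load, Tree("intconst")])
    return Tree("intassign", "rule1", [address(), add])

ok = check(ROOT, example("rule3"))
bad = check(ROOT, example("rule2"))       # wrong rule on the content node
print(ok, bad)
```

The lexicographic measure in the docstring mirrors the termination argument described in the text: the flag is what allows a call on the same tree to count as a decrease.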

As stated before, these operational formulations of the checkers with their
flagged recursion schemes are hardly usable in a proof context, and are therefore
not very helpful to establish the correctness of the back-end. To prove that the
code to be generated from the labeled tree will actually implement the expression
compiled, we will have to induct on the number of code extraction steps and
therefore need more usable declarative properties about the resulting machine
code. As these properties are not easily expressed, we identified the situations
that do cause an error:

- covering problem: the root of a rule tree does not match the operator of
  the referencing labeled node, or a node from a rule rooted somewhere in
  the labeled tree does not match the operator of the labeled node it covers
  (except if the rule node is a register and the covered node is labeled with
  a rule).

- schedule problem: there is a node, labeled with a rule, having a schedule
  number smaller than the one of its child node also labeled with a rule.

- input problem: there is a register node from a rule rooted somewhere in the
  labeled tree covering a node not labeled with a rule, or covering a node with
  a rule whose output register variable is not assigned to the same instance as
  the covering register.

- overwrite problem: there is a node, labeled with a rule and with a schedule
  number lying between the schedule numbers of two “communicating”
  rules using the value passing register as output or temporary node.
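
As an illustration of how one of these error situations can be detected, the following Python sketch looks for schedule problems in a labeled tree. It is a simplified sketch under stated assumptions: the Node class, the helper labeled_tree, and the reading of "child node also labeled with a rule" as the nearest rule-labeled node below are illustrative choices, not the PVS definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    op: str
    rule: Optional[str] = None
    schedule: Optional[int] = None
    children: List["Node"] = field(default_factory=list)

def schedule_problems(node, parent_rule_node=None):
    """Return (parent op, child op) pairs in which a rule-labeled node has a
    smaller schedule number than the nearest rule-labeled node below it."""
    problems = []
    if node.rule is not None:
        if (parent_rule_node is not None
                and parent_rule_node.schedule < node.schedule):
            problems.append((parent_rule_node.op, node.op))
        parent_rule_node = node
    for child in node.children:
        problems.extend(schedule_problems(child, parent_rule_node))
    return problems

def labeled_tree(s_assign, s_add, s_load):
    # Labeled tree for V := V + 1 with hypothetical schedule numbers.
    address = lambda: Node("local", children=[Node("intconst")])
    load = Node("content", "rule3", s_load, [address()])
    add = Node("intadd", "rule2", s_add, [load, Node("intconst")])
    return Node("intassign", "rule1", s_assign, [address(), add])

fine = schedule_problems(labeled_tree(3, 2, 1))
wrong = schedule_problems(labeled_tree(3, 1, 2))
print(fine, wrong)
```

With the schedule 3, 2, 1 no problem is reported; swapping the schedule numbers of the addition and the load makes the addition run before its operand has been loaded, which is reported as a problem.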




Axel Dold and Vincent Vialard



checker [
  R       : TYPE+,                          % registers
  % Operator type and accessors
  Op      : TYPE+,
  ...                                       % accessor for numerical variables
  ...     : [Node->nat],
  inp     : [Rule->finite_set[RegVar]],     % input register vars of the rule
  ...     : [Rule->at_most_one[RegVar]],    % output register of the rule
  tmp     : [Rule->finite_set[RegVar]],     % temporary register variables
  rulemap : [subrange(1,nrule)->{r:Rule ...
  ...
  ...     : TYPE = Tree[Node]
  ...     : TYPE = Tree[Op]
  ...     : TYPE = [RegVar->R]              % register assignments

Fig. 3. Parameterization of the PVS theory
If none of these situations is encountered, the values of the sub-expressions should
be computed correctly, in time, stored and retrieved in the proper registers, and
the temporary storages should be made in a secure manner. This should imply
the correctness of the code generated. It will have to be established formally by
an induction proof on the structure of the initial expression with the help of the
correctness property of the rule schemata.
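
One of the situations listed above, the schedule problem, can be illustrated with a small Python sketch. It is not the PVS checker of this paper: the `Node` class, its field names, and the convention that rule number 0 means "not labeled with a rule" (mirroring the test `rnum(vt) = 0` in the specification) are assumptions made for illustration, and only direct children are inspected, as in the informal description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical labeled tree node; rule == 0 means 'no rule rooted here'."""
    rule: int = 0
    sched: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def has_schedule_problem(t: Optional[Node]) -> bool:
    """True if some rule-labeled node is scheduled before a rule-labeled child."""
    if t is None:
        return False
    for child in (t.left, t.right):
        if child is None:
            continue
        # Parent and child both carry a rule, but the parent runs first.
        if t.rule != 0 and child.rule != 0 and t.sched < child.sched:
            return True
    return has_schedule_problem(t.left) or has_schedule_problem(t.right)


# The parent (schedule number 3) is evaluated after both children: no problem.
ok_tree = Node(rule=1, sched=3, left=Node(rule=2, sched=1), right=Node(rule=3, sched=2))
# The parent (schedule number 1) would be evaluated before its left child.
bad_tree = Node(rule=1, sched=1, left=Node(rule=2, sched=2), right=Node())
```

A checker in this style only reports that a problem exists; it says nothing about how the schedule should be repaired.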

To express these properties we need a function that retrieves the rule node 
covering a subnode of the expression tree. The subset of nodes covered by a rule is 
defined by the predicate rule_covers_subtree? and the function covering_op 
retrieves the operator from the rule tree covering a given node. 

The PVS predicate wrong_input presented in Fig. 5 encodes the input prob- 
lem property for the rule rooted at t. It will be usable in place of the corre- 
sponding checker function (Fig. 4) in the proof thanks to the lemma presented 
in Fig. 6. We proceeded in a similar manner for the three other checkers. 

The proofs were done by structural induction on the expression tree. Since the
induction hypotheses are implications, we had to write the checker functions
in such a way that it is provable that the successful check of a tree implies
the same for its subtrees (in order to “trigger” the consequence part of the
implication of the hypothesis). These properties were themselves established by
structural induction and sometimes needed other inductive lemmas (i.e. nested
induction).

The proofs are not trivial (a few weeks were invested in specification,
corrections, and proofs) but relatively short (1000 interactive steps for the whole
theory, including TCCs). They could be further automated, possibly with
user-defined strategies, but once established, thanks to the parameterization of
the theory, they will not need to be reworked.

We encoded in PVS a subset of rule schemata for the translation from the 
MIS intermediate language to DEC-Alpha and instantiated the checker theory 
for such verifications. The small example presented in the previous section was 
successfully processed. 

The proof of the global process will be achieved by induction over the schedule
number of the successfully checked labeled tree. We will consider a pair
(code, tree) consisting of an assembly code sequence and an intermediate
language expression tree. The assembly code is considered to be evaluated prior
to the expression, bringing the machine into a state (values stored in registers
and/or memory) in which the expression will then be evaluated. We start with
an empty code sequence and the initial labeled expression tree, and the pair will
be updated at each step to a new pair (code ++ code', tree') as follows:

— in the expression tree the selected node is replaced by a node labeled with 
its output register (according to the substitution) with two leaves as sons. 

— the assembly code part of the rule associated to the selected node (with its 
variables instantiated according to the assignments) is appended to the
existing code sequence. 

The equivalence between the two pairs will be established using the declarative 
properties that were shown to be implied by a successful check along with the 
valcheck?(newrule?: bool, t: LTree, r: RTree, a: Asg): RECURSIVE bool =
  IF newrule? THEN
    CASES r OF                        % new rule
      leaf: ⊥,                        % - shouldn't be empty
      node(vr,lr,rr):
        CASES t OF
          leaf: ⊥,                    % - shouldn't cover a leaf
          node(vt,lt,rt):             % nor a register (continue verifying)
            ¬Reg?(vr) ∧ valcheck?(⊥,lt,lr,a) ∧ valcheck?(⊥,rt,rr,a)
        ENDCASES
    ENDCASES
  ELSE                                % old rule
    CASES r OF
      leaf:                           % - is a leaf
        CASES t OF
          leaf: ⊤,                    % -- and covers a leaf (ok, over)
          node(vt,lt,rt):             % -- or covers a node (continue verifying)
            IF rnum(vt) = 0
            THEN valcheck?(⊥,lt,r,a) ∧ valcheck?(⊥,rt,r,a)
            ELSE valcheck?(⊤,t,rtree(rulemap(rnum(vt))),areg(vt))
            ENDIF
        ENDCASES,
      node(vr,lr,rr):                 % - is a node
        CASES t OF
          leaf: ⊥,                    % -- shouldn't cover a leaf
          node(vt,lt,rt):             % -- covers a node, verify if a value is
            IF rnum(vt) = 0           %    required, and passed if necessary
            THEN ¬Reg?(vr)
            ELSE Reg?(vr) ∧ ¬empty?(out(rulemap(rnum(vt))))
                 ∧ a(Reg(vr)) = areg(vt)(choose(out(rulemap(rnum(vt)))))
                 ∧ valcheck?(⊤,t,rtree(rulemap(rnum(vt))),areg(vt))
            ENDIF                     % ... and continue verifying
            ∧ valcheck?(⊥,lt,lr,a) ∧ valcheck?(⊥,rt,rr,a)
        ENDCASES
    ENDCASES
  ENDIF
MEASURE lex2(depth(t), bool2nat(¬newrule?))   % either t decreases
                                              % or newrule? becomes true

valuecheck?(t: LTree): RECURSIVE bool =
  CASES t OF
    leaf: ⊤,        % real verification starts at the first node with a rule
    node(v,lt,rt): IF rnum(v) = 0
                   THEN valuecheck?(lt) ∧ valuecheck?(rt)
                   ELSE valcheck?(⊤,t,rtree(rulemap(rnum(v))),areg(v))
                   ENDIF
  ENDCASES
MEASURE t BY <<     % t is structurally decreasing



Fig. 4. Operational specification of the value passing checker 
% t is labeled with a rule that covers a subtree t1 with a register node
% but t1 doesn't store a value in the proper register

wrong_input(t: LTree): bool =
  CASES t OF
    leaf: ⊥,
    node(vt,lt,rt):
      rnum(vt) /= 0
      ∧ ∃(t1: (rule_covers_subtree?(rtree(rulemap(rnum(vt))),t))):
          Reg?(covering_op(rtree(rulemap(rnum(vt))),t)(t1))
          ∧ (leaf?(t1)
             ∨ rnum(val(t1)) = 0
             ∨ empty?(out(rulemap(rnum(val(t1)))))
             ∨ areg(vt)(Reg(covering_op(rtree(rulemap(rnum(vt))),t)(t1)))
               ≠ areg(val(t1))(choose(out(rulemap(rnum(val(t1)))))))
  ENDCASES



Fig. 5. Declarative characterization of the input problem 



% If t is valuechecked then it does not have a subtree with an input problem

valuecheck_correct: LEMMA
  ∀(t: LTree):
    valuecheck?(t) ⇒ ¬(∃(t1: (subtree?(t))): wrong_input(t1))



Fig. 6. Link between operational specification and declarative property 



correctness properties of the translation rules. Intuitively, the properties implied
by covercheck?(t) will be used to show the correct use of the rules, and the
others to show the proper storage and retrieval of the subterm values.

5 Conclusion 

We described in this paper our approach to the problem of formally specifying
a validation procedure for the results of a compiler back-end. We defined a
generic operational PVS specification for such a program and proved declarative
properties that are more usable for the global correctness proof. The genericity of
the specification should allow an easy use of the theory for various intermediate
languages and target machines.

The specification can be refined step by step into a PVS function close enough
to the actual encoding of the checker in order to prove its implementation correct
(this is the approach taken in [2]). If the checker is to be implemented in a higher
level language, there must exist a correctly implemented compiler for it (this
initial compiler is part of the Verifix project [5]).
Abstract. This paper describes how program checking can be used to
establish the correctness of a compiler front-end which was generated
by unverified compiler construction tools. The basic idea of program
checking is to use an unverified algorithm whose results are checked by
a verified component at run time. The approach simplifies the
construction of a verified compiler front-end, because checking the result
of the analysis is much simpler to verify than a highly sophisticated
compiler front-end itself. It also makes it possible to define a notion of
front-end correctness. Furthermore, we are still able to use existing
generator tools without major modifications. Additionally, this work points
out the tasks which still have to be verified and it discusses the flexibility
of the approach.



1 Introduction 

In order to construct a verified compiler we have to consider not only the trans- 
formation and code generation phase which can be verified with respect to the 
source and target language semantics but also the analysis of programs. Usually, 
work on constructing correct compilers ignores this analysis phase. All seman- 
tic definitions of the source language are based on attributed structure trees 
obtained after semantic analysis, see e. g. [3,15,6,1]. 

However, in order to construct a correct compiler, the correctness of the 
analysis phase must not be ignored. This paper bridges the gap, i. e., we show how 
to construct a correct front-end. In fact, it is not trivial to define the correctness 
of the analysis phase. Basically it maps a character sequence to attributed syntax 
trees. But how to define correctness of this mapping? 

It is common to define semantics of programming languages on abstract 
and/or attributed syntax trees. Hence, in order to have a complete language 
definition, the relation between the source text and the attributed syntax tree 
has to be specified. We expect this relation $\varphi$ to be available. Usually, compiler
writers prefer to use their own representation of attributed syntax trees and 
base the dynamic semantics on them. For a correct compiler, it must be proven 
that the programming language semantics used in the compiler preserves the 
programming language semantics as defined by the language definition. In this 
paper, we assume that this is already being done. Hence, we have to ensure that 
the relation between source text and attributed syntax trees is implemented 
correctly. 

Instead of proving the correctness of the analysis phase, we check the
correctness of the results produced during the analysis dynamically. For simplicity,
we assume that the static semantics is specified by an attribute grammar AG,
and the relation $\varphi$ between source text and attributed syntax trees is specified
inductively over the structure of the syntax trees. The basic idea of front-end
checking is first to check the semantic analysis, where it is sufficient to check
that for every attribution rule $X_{i_0}.a_0 \leftarrow f(X_{i_1}.a_1, \ldots, X_{i_k}.a_k)$ in AG the
corresponding attributes of the attributed structure tree define an equality. Second,
if the result of the semantic analysis was accepted, we check scanner and parser by
checking whether the source text is related by $\varphi$ to the abstract syntax tree. Our
approach allows the use of front-ends generated by unverified tools or front-ends
implemented by hand. We do not assume anything about the implementation
language of the front-end. In particular, we do not assume that it is implemented
in a language for which there exists a verified compiler. Of course, the checker
itself has to be verified.

In our case study we use the cocktail tool box [8] which generates C programs.
Our implementation language for the checker part is Sather-K [11], a type-safe
object-oriented language with generic classes (similar to templates in C++).
The benefit of our approach is illustrated by the number of lines of code which
have to be verified in order to prove the front-end implementation correct. The
generated front-end of our case study is implemented by 22,000 lines of C code
while the checker consists of 1,300 lines of Sather code.

The following section introduces the idea of program checking in general 
and discusses related work. In Section 3 we present an architecture for front- 
end checking and describe the particular components of the checker in more 
detail. We examine the checking of semantic analysis, we describe the checker 
for scanner and parser, and we discuss correctness properties of each component 
of the checker. In Section 4 we present an example and draw conclusions in 
Section 5. 

2 Basics and Related Work 

The idea of program checking shows how to construct partially correct programs
without direct verification of the program. Instead, it uses a verified checker
as a filter. Consider a program $\pi$ with input $x$ and output $y$. Let $P(x)$ be a
precondition of $\pi$ and $Q(x, y)$ a postcondition. A program $\pi$ is partially correct
iff for every $x$ satisfying $P(x)$ either $\pi$ refuses $x$ or $\pi$ computes an output $y$ such
that $Q(x, y)$ holds (i.e. $\{P(x)\}\ y := \pi(x)\ \{Q(x, y)\}$ in Hoare triple notation). The idea
of program checking can be summarized by the following function $\pi'$:

fun π'(x : T) : T' is
    y := π(x);
    if check_π(x, y) then return y
    else abort
end

The boolean function $\mathit{check}_\pi$ must imply the postcondition $Q$. The following
theorem shows the validity of the approach.

Theorem 1 (Program Checking). Let $\pi(x : T) : T'$ be an unverified program
without side-effects and $\mathit{check}_\pi(x : T, y : T') : \mathit{bool}$ be a side-effect free function
satisfying $\{P(x)\}\ z := \mathit{check}_\pi(x, y)\ \{z = \mathit{true} \Rightarrow Q(x, y)\}$. Then it holds that
$\{P(x)\}\ y := \pi'(x)\ \{Q(x, y)\}$.

Proof. We sketch the proof. It can be formalized using the standard Hoare calculus.
Since $\pi$ is side-effect free, the input $x$ remains unchanged. If $\pi'$ does not abort,
it returns $y$. The $y$ returned is the same as the input to $\mathit{check}_\pi(x, y)$, because
$\mathit{check}_\pi$ is side-effect free. Furthermore, when $y$ is returned,
$\mathit{check}_\pi(x, y) = \mathit{true}$ must hold. Hence $Q(x, y)$ holds.

Hence, the only assumption on $\pi$ is its side-effect freeness. No further
assumptions on $\pi$ are made. The function $\pi'$ therefore provides a bootstrapping
approach to construct partially correct programs. The approach is useful if the
formal verification of $\mathit{check}_\pi$ is much easier than that of $\pi$, or if $\pi$ is
much larger than $\mathit{check}_\pi$. The difficulty, however, is the assumption
that $\pi$ is side-effect free. We will call this property of a program being
side-effect free the “wrap” property.
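
The wrapper function of this section can be sketched in Python as follows. This is a minimal illustration under assumptions, not code from the case study: sorting is chosen as the unverified program, and the names `checked`, `CheckFailed`, `unverified_sort`, `buggy_sort` and `check_sort` are hypothetical. Inputs are passed as tuples so that the unverified function cannot modify them, which stands in for the side-effect freeness required by Theorem 1.

```python
from collections import Counter
from typing import Callable, Tuple


class CheckFailed(Exception):
    """The 'abort' branch: the checker could not confirm the postcondition."""


def checked(pi: Callable, check: Callable[..., bool]) -> Callable:
    """Build the wrapper: run the unverified pi, return its result only if check accepts."""
    def pi_prime(x):
        y = pi(x)
        if check(x, y):
            return y
        raise CheckFailed("result rejected by checker")
    return pi_prime


def unverified_sort(xs: Tuple[int, ...]) -> Tuple[int, ...]:
    # Plays the role of the large, unverified program.
    return tuple(sorted(xs))


def buggy_sort(xs: Tuple[int, ...]) -> Tuple[int, ...]:
    # A faulty variant that forgets to sort.
    return tuple(xs)


def check_sort(xs: Tuple[int, ...], ys: Tuple[int, ...]) -> bool:
    # Postcondition Q(x, y): y is ordered and is a permutation of x.
    ordered = all(a <= b for a, b in zip(ys, ys[1:]))
    return ordered and Counter(xs) == Counter(ys)


safe_sort = checked(unverified_sort, check_sort)
unsafe_sort = checked(buggy_sort, check_sort)
```

Note that `unsafe_sort` still returns a result when the faulty program happens to be right for a particular input; the wrapper guarantees partial correctness only, exactly as in the theorem.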

Related Work. Our checker approach is closely related to the work of M. Blum
on result checking [2,16] and the ideas of [9]. A more detailed discussion of the
theoretical aspects of our approach can be found in [10]. 

Program checking is already used in compiler construction for checking prop- 
erties necessary to establish correctness of a transformation. Necula and Lee [14] 
describe a compiler which contains a certifier that automatically checks the type 
safety and the memory safety of any assembler program produced by the com- 
piler. The certification process detects statically compilation errors of a certain 
kind but it does not establish full correctness of the compilation. Nevertheless, 
this work shows that program checking can be used to produce efficient imple- 
mentations with consideration of safety requirements. 

3 The General Approach 

Figure 1 describes our architecture for compiler front-end checking. White boxes
denote components which can be used without verification, while grey boxes
denote parts which have to be verified in order to construct a correct front-end.

The architecture of the analyzer is the typical architecture of a compiler
front-end. It accepts the program $\Pi$ represented as a character sequence and
produces an attributed abstract syntax tree. The scanner produces a sequence of
tokens, i.e. the internal representation of keywords, identifiers, constants, special
symbols, etc. We use the word symbol to denote the corresponding character
sequence. The scanner may be generated from regular expressions RE' describing
the symbols. It removes white spaces and comments. The parser produces an
abstract syntax tree. It may be generated from a context-free grammar G'. Finally,
the semantic analysis enriches the abstract syntax tree with attributes. Again, the
semantic analysis may be generated from an attribute grammar AG'. The attribute
grammar AG' used for the generation of the semantic analysis need not be the
same as AG, because semantic analysis generators may require special classes
of attribute grammars.







Fig. 1. A general architecture for checking compiler front-ends 



The checker accepts the program $\Pi$ and the attributed abstract syntax tree
AAST as inputs. The SA Checker verifies the validity of attribute values w.r.t.
the attribute grammar AG from the language specification. If the check does
not reveal an error, the abstract syntax tree is passed to the Unparser,
which uses the relation $\varphi$ to compute a sequence of symbols. This sequence
serves as the reference against which the original file is compared. Of course,
the comparison has to ignore white spaces and comments. If the comparison
succeeds, the program was parsed correctly. Otherwise the program is rejected.
A rejection does not mean that the program was compiled incorrectly. It just means
that the checker was not able to establish the correctness of the compilation. Of
course, it is our goal to build a checker which is able to check the correctness
of all compiled programs. The interface basically implements the functionality
of the function $\pi'$. It ensures the requirements of Theorem 1. Thus, the checker
implements the following function:

fun check_frontend(Π : Char*, AAST : AG) : BOOL is
    if ¬check_semantic_analysis(AAST) then return false; fi;
    s := unparse(AAST);
    return compare(Π, s);
end;

In Section 3.1 we discuss different implementations of the interface, Sec- 
tion 3.2 describes how to implement the checker for semantic analysis, and Sec- 
tion 3.3 describes the implementation of the checker for the scanner and parser. 



3.1 Safe Communication of Checker and Front-End

As Theorem 1 shows, the function $\pi$ to be checked, in our case the compiler
front-end, must be free of side-effects. This can be ensured by strictly separating
the memory spaces of the compiler and its checker. If the operating system is
assumed to be correct¹, there are several alternatives to make sure that it is
impossible for the front-end to write into the memory of the checker.

— If the implementation language does not allow pointers to the memory, we 
are able to prove that the compiler behaves safely.

— If checking and compiling are two parallel processes with different memory 
spaces, the operating system assures that the memory of one process cannot be
altered by another process. Nevertheless, we have to verify and implement the 
protocol on which the two processes communicate. In our implementation of 
this protocol, compiler and checker communicate by mutual file access. The 
attributed structure tree is written to a file in a general interchange format 
which is then translated to an internal representation. This representation of 
the AST is reliable when the check has succeeded. The interchange format 
is defined in [12]. 



3.2 Checking Semantic Analysis 

The general idea of checking semantic analysis is to interpret the attribute defini- 
tion rules R of the language specification AG as equations on the corresponding 
attributes. Instantiating these equations with the attribute values computed by 
the compiler (using AG') leads to a set of equations. Semantic analysis worked
correctly if all these equations together with the conditions C on attribute values
are fulfilled.

The static semantics of the programming language is specified by an
attribute grammar AG. An attribute grammar is a quadruple $AG = (G, A, R, C)$
where $G = (N, T, P, Z)$ is a context-free grammar describing the abstract syntax,
$A = \bigcup_{X \in T \cup N} A_X$, where $A_X$ is the set of attributes belonging to $X$,
$R = \bigcup_{p \in P} R(p)$, where $R(p)$ is the set of attribution rules for the production $p \in P$,
and $C = \bigcup_{p \in P} C(p)$, where $C(p)$ is the set of conditions for the production $p \in P$.
We write $X.a$ to indicate that $a \in A_X$. The attribution rules for a production
$p : X_0 ::= X_1 \ldots X_n$ have the form
$X_{i_0}.a_0 \leftarrow f(X_{i_1}.a_1, \ldots, X_{i_k}.a_k)$, $0 \le i_0, \ldots, i_k \le n$. A condition for $p$ has the
form $c(X_{j_1}.b_1, \ldots, X_{j_l}.b_l)$, $0 \le j_1, \ldots, j_l \le n$.

¹ Compiler verification does not deal with hardware verification or verification of the
operating system. Though correctness of the base system is essential for the
correctness of the global system, this is beyond the scope of our work.

An attributed abstract syntax tree $t$ is correctly attributed iff for each node
$\nu_0$ of $t$ with children $\nu_1, \ldots, \nu_n$ obtained by production $p : X_0 ::= X_1 \ldots X_n$ the
following properties are satisfied:

1. $\nu_{i_0}.a_0 = f(\nu_{i_1}.a_1, \ldots, \nu_{i_k}.a_k)$ for each attribution rule
   $X_{i_0}.a_0 \leftarrow f(X_{i_1}.a_1, \ldots, X_{i_k}.a_k) \in R(p)$.

2. $c(\nu_{j_1}.b_1, \ldots, \nu_{j_l}.b_l) = \mathit{true}$ for each $c(X_{j_1}.b_1, \ldots, X_{j_l}.b_l) \in C(p)$.

Here $\nu.a$ denotes the value of attribute $a$ of node $\nu$. It must hold that $a \in A_X$ if $\nu$
corresponds to non-terminal $X$.

Therefore, a checker for semantic analysis must check whether all attributes
of AG are computed, and whether (1) and (2) are satisfied. This can be done
simply by traversing the attributed abstract syntax tree (in any order) and
performing the checks. Thus:

fun check_semantic_analysis(AAST : AG) : BOOL is
    for each instance node ν_0 of AAST do
        let X_0 be the non-terminal corresponding to ν_0;
        for a ∈ A_{X_0} do
            if ν_0.a has no value then return false; fi
        let ν_1, ..., ν_n be the children of ν_0 produced
            according to production p : X_0 ::= X_1 ... X_n;
        for each attribution rule X_{i_0}.a_0 ← f(X_{i_1}.a_1, ..., X_{i_k}.a_k) ∈ R(p) do
            if ν_{i_0}.a_0 ≠ f(ν_{i_1}.a_1, ..., ν_{i_k}.a_k) then return false; fi;
        for each condition c(X_{j_1}.b_1, ..., X_{j_l}.b_l) ∈ C(p) do
            if c(ν_{j_1}.b_1, ..., ν_{j_l}.b_l) = false then return false; fi;
    od;
    return true;
end;

Remark 1. The checking of the attribution is much simpler than computing the 
attributes according to a special evaluation order. Thus, the attribute grammar 
AG defined by the language designer may be different from the attribute gram- 
mar AG' used for constructing the semantic analysis. While AG' has to have 
properties useful for generation, e. g. AG' has to be ordered, AG needs only to 
be well-defined. 
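
The traversal performed by the semantic analysis checker can be sketched in Python for a tiny expression grammar with a single boolean attribute b ("expression is constant"). This is an illustration under assumptions, not the Sather-K checker: the `Node` class, the `REQUIRED` and `ARITY` tables and the helper constructors are hypothetical, and the grammar has no conditions, so only the attribution rules and the presence of attribute values are checked.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                       # "Ident", "IntConst", "+", "*"
    attrs: dict                     # values computed by the unverified front-end
    children: list = field(default_factory=list)


# Attributes each kind of node must carry, and the number of children it has.
REQUIRED = {"Ident": {"b", "id"}, "IntConst": {"b", "value"}, "+": {"b"}, "*": {"b"}}
ARITY = {"Ident": 0, "IntConst": 0, "+": 2, "*": 2}


def expected_b(node: Node) -> bool:
    """Right-hand side of the attribution rule for b, evaluated on stored values."""
    if node.kind == "Ident":
        return False
    if node.kind == "IntConst":
        return True
    left, right = node.children
    return left.attrs.get("b") is True and right.attrs.get("b") is True


def check_semantic_analysis(root: Node) -> bool:
    stack = [root]                  # the traversal order is irrelevant
    while stack:
        node = stack.pop()
        if node.kind not in REQUIRED or len(node.children) != ARITY[node.kind]:
            return False
        if any(node.attrs.get(a) is None for a in REQUIRED[node.kind]):
            return False            # some attribute has no value
        if node.attrs["b"] != expected_b(node):
            return False            # instantiated equation does not hold
        stack.extend(node.children)
    return True


def ident(name): return Node("Ident", {"b": False, "id": name})
def const(v): return Node("IntConst", {"b": True, "value": v})
def op(kind, left, right, b): return Node(kind, {"b": b}, [left, right])


# a * (3 + 4) + b, first with consistent attributes, then with a wrong b on 3 + 4.
good = op("+", op("*", ident("a"), op("+", const(3), const(4), True), False), ident("b"), False)
bad = op("+", op("*", ident("a"), op("+", const(3), const(4), False), False), ident("b"), False)
```

The checker never computes attributes in a dependency order; it only re-evaluates each rule once on values that are already stored, which is why no evaluation strategy is needed.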

3.3 Checking the Correctness of Scanner and Parser 

Semantic analysis together with scanning and parsing implements a function
from character sequences to attributed structure trees. It yields a unique AST
for a particular program. After the attributed structure tree has been checked, it is
certain that the computed attributes are consistent with the language specification.

We ignore the attributes of the AAST. The Unparser implements the relation
$\varphi$ defined by the language designer. It produces a sequence of symbols, i.e. a
sequence of character sequences representing the relevant units of a program. The
symbols of the reference sequence have to occur in the same order in the original
program. Therefore the unparser is correct iff $\mathit{unparse}(t) = \varphi(t)$ for each attributed
abstract syntax tree $t$. The correctness of an unparser might be ensured by a
checker.

The Comparator processes the sequence of symbols produced by the Unparser
and compares it with the original file. Informally speaking, the comparator shifts a
kind of window over the character sequence. The information about the context
of this window is used to determine the actions of the Comparator: ignore white
spaces, add white spaces, skip comments, report an error, etc.

Some properties of existing programming languages require additional check- 
ing capabilities: 

— Valid symbols which are prefix of other valid symbols require consideration of 
significant white spaces in order to check the principle of the longest match. 

— Priorities of operators are usually defined informally and are not represented 
in the abstract syntax. Thus they have to be checked separately. 

— Different notations of the same numbers have to be dealt with in any case, since
the actual values of constants are processed during compilation².

— Superfluous symbols, e. g.E or the number of parentheses, can be ignored 
during the comparison. 

Remark 2. This approach also checks implicit rules such as the principle of
longest match and operator priorities. Both are mandatory for the correctness of
the front-end. However, the checker might reject legal attributed abstract syntax
trees. For example, if the unparser includes only the semantically necessary
parentheses in expressions and the original program contains more parentheses,
the comparator returns false. To improve the quality, the unparser also has
to produce semantically superfluous symbols. This information can be obtained
from the derivation tree of the syntactic analysis. For example, we could
save the number of reductions performed to accept a parenthesized expression
in an attribute. During unparsing we create parentheses according to the
number saved in the tree.

With such extensions, it is possible to define correct checkers for many of
the existing programming languages. Though the checkers are not complete, we
can improve them using more sophisticated comparison strategies. The trick
of generating a programming language instead of parsing it eliminates a lot of
problems, e.g. ambiguities of the grammar or special properties of the acceptance
mechanism (LL or LR), which make scanning and parsing quite complicated.

² In order to preserve the simplicity of our checker we decided to check the correctness
of the transformation of numbers separately.
The Comparator is allowed to ignore characters which do not carry information.
Which characters carry information depends on the actual programming
language. Even white spaces may carry information in some special cases. This
has to be considered to establish correctness. In general, one has to prove that the
comparator accepts the same AST with or without this additional information.
The comparator is implemented by the following algorithm:

fun compare(Π : Char*, s : Symbol*) : BOOL is
    while ¬empty(s) ∧ ¬empty(Π) do
        remove_white_spaces(Π);
        if ¬match(head(s), Π) then return false; fi;
        if ¬is_delimiter(last(head(s))) ∧ ¬is_delimiter(head(Π))
            then return false fi;
        s := tail(s);
    od;
    remove_white_spaces(Π);
    return empty(s) ∧ empty(Π);
end

Compare considers in turn each symbol of s. Such a symbol is compared
with the first characters of $\Pi$. For this purpose, first all superfluous characters
are removed from $\Pi$, e.g. white spaces, newline characters, and comments. This
is done by the function remove_white_spaces, which implements a finite
automaton specified by the language definition. Then, the function match
compares the symbol with the characters at the front of $\Pi$. Again, match implements
a finite automaton defined by the language report. Numbers may require special
treatment since different character sequences may represent the same number:
leading zeroes are superfluous, leading zeroes or the plus sign in the exponent of
a floating point number may be superfluous, and trailing zeroes in the mantissa of
a floating point number are superfluous. match must take these properties into
account. As a side-effect, the function match removes the accepted characters
from $\Pi$. After each match of head(s), the first character of $\Pi$ must be a delimiter,
provided head(s) is not a delimiter, because each non-delimiter symbol must be
followed by a delimiter. Delimiters are specified by the language definition;
characters like white spaces, newlines, parentheses, semicolons etc. are typical
delimiters. After termination of the while-loop and removal of all leading
superfluous characters from $\Pi$, both the symbol sequence and the character
sequence must be empty.
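
The comparison loop can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not part of the paper: the input is a string indexed by a position instead of a sequence that is consumed, only white space is treated as superfluous (no comments), `match` is a plain prefix test without the special treatment of numbers, the end of the input counts as a delimiter, and the delimiter set `DELIMITERS` is hypothetical.

```python
from typing import List

# Hypothetical delimiter set for a small expression language.
DELIMITERS = set(" \t\n()+*;")


def skip_white_space(text: str, pos: int) -> int:
    """Advance pos past white space (comments are not handled in this sketch)."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def compare(text: str, symbols: List[str]) -> bool:
    """Check that the source text consists of exactly the given symbols, in order."""
    pos = 0
    for sym in symbols:
        if not sym:
            return False                      # empty symbols are not meaningful
        pos = skip_white_space(text, pos)
        if not text.startswith(sym, pos):     # simplified 'match'
            return False
        pos += len(sym)
        # Longest match: a non-delimiter symbol must be followed by a delimiter
        # (or by the end of the input, which this sketch treats as a delimiter).
        if sym[-1] not in DELIMITERS and pos < len(text) and text[pos] not in DELIMITERS:
            return False
    pos = skip_white_space(text, pos)
    return pos == len(text)                   # nothing may be left over


reference = ["a", "*", "(", "3", "+", "4", ")", "+", "b"]
```

For example, `compare("ab", ["a", "b"])` is rejected by the delimiter test, because the scanner's longest-match rule would have read `ab` as a single identifier.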

4 Example 

Our example language defines simple expressions with variables, constants,
addition, and multiplication. The attribute grammar AG in Fig. 2 describes the
abstract syntax of this language. The attribution computes
the “expression is constant” attribute ($b = 1$ or $b = 0$). Multiplication has higher
precedence than addition. Addition is left-associative. The mapping $\varphi$ specifies
the concrete syntax for the example language. For simplicity, we omitted the
parentheses in the specification of $\varphi$. Additionally, it must be specified that if
a multiplication has a subtree representing an addition, the addition must be
enclosed by “(” . . . “)” and that any other expression or subexpression may have
an arbitrary number of enclosing parentheses.



E ::= Ident       {b ← false, id ← STRING}
    | IntConst    {b ← true, value ← INT}
    | +(E1, E2)   {b ← b1 ∧ b2}
    | *(E1, E2)   {b ← b1 ∧ b2}

φ(Var)        = Var.id
φ(Value)      = Val.value
φ(+(E1, E2))  = φ(E1) '+' φ(E2)
φ(*(E1, E2))  = φ(E1) '*' φ(E2)

Fig. 2. Attribute grammar and mapping φ from abstract to concrete syntax



The left-hand side of Fig. 3 shows the AST representation of the expression
a * (3 + 4) + b, the right-hand side shows the set of equations derived from
the attribution of the AST. The superscript of an AST node describes the
attribute values computed by the semantic analysis. The subscript is a unique
number which relates the AST node with an equation on the right. The function
correctly_attributed instantiates the equations corresponding to attribution rules
of the abstract syntax with the attributes computed by the semantic analysis
and then checks the consistency of the formulae. The equations in our example
are consistent, cf. Fig. 3. Thus, semantic analysis worked correctly and the
function unparse, derived from $\varphi$, is invoked. It traverses the AST and produces the
stream "a", "*", "(", "3", "+", "4", ")", "+", "b". Since unparse considers
operator precedences, the parentheses were inserted. Thus, the original expression
is produced, which establishes the correctness of syntactic analysis.

AST (node number: kind, attribute values)

0: +                      b = 0
   1: *                   b = 0
      2: Ident            b = 0, id = a
      3: +                b = 1
         4: IntConst      b = 1, value = 3
         5: IntConst      b = 1, value = 4
   6: Ident               b = 0, id = b

Node   Instantiated equation   Right-hand side of the rule
0      0 = 0 ∧ 0               1.b ∧ 6.b
1      0 = 0 ∧ 1               2.b ∧ 3.b
2      0 = 0                   Ident.b
3      1 = 1 ∧ 1               4.b ∧ 5.b
4      1 = 1                   IntConst.b
5      1 = 1                   IntConst.b
6      0 = 0                   Ident.b

Fig. 3. AIF representation and attribute equations for a * (3 + 4) + b

We show how the checker reacts to erroneous parsers. Suppose the parser
had accidentally ignored the parentheses (or the scanner had accidentally removed
the tokens), i.e. the syntax tree in Fig. 4(a) would be produced by the erroneous
parser. The unparser produces the stream "a", "*", "3", "+", "4", "+", "b".
In contrast to the above example, parentheses are not included because of the
priorities. The comparator recognizes that this stream differs from the input
text.

Assume now that the parser accidentally exchanged the priorities of "*" and "+".
Then it produces the syntax tree in Fig. 4(b). The unparser produces a stream
which again differs from the input text.
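
How an unparser that respects operator precedence exposes such parser errors can be sketched in Python. The tree encoding (a string for a leaf, a triple `(op, left, right)` for an operator node), the `PRECEDENCE` table and the rule that a right operand of equal precedence is parenthesized (to respect left-associativity) are assumptions of this sketch, not the specification of the mapping in Fig. 2.

```python
from typing import List, Tuple, Union

Tree = Union[str, Tuple[str, "Tree", "Tree"]]

# Multiplication binds tighter than addition.
PRECEDENCE = {"+": 1, "*": 2}


def unparse(tree: Tree) -> List[str]:
    """Produce the symbol stream for a tree, inserting only necessary parentheses."""
    if isinstance(tree, str):
        return [tree]
    op, left, right = tree
    return (wrap(left, PRECEDENCE[op], right_side=False)
            + [op]
            + wrap(right, PRECEDENCE[op], right_side=True))


def wrap(sub: Tree, parent_prec: int, right_side: bool) -> List[str]:
    """Unparse a subtree and parenthesize it if it binds more weakly than its parent."""
    symbols = unparse(sub)
    if isinstance(sub, tuple):
        prec = PRECEDENCE[sub[0]]
        if prec < parent_prec or (right_side and prec == parent_prec):
            return ["("] + symbols + [")"]
    return symbols


# Correct tree for a * (3 + 4) + b.
correct = ("+", ("*", "a", ("+", "3", "4")), "b")
# Tree a parser might build if it dropped the parentheses: ((a * 3) + 4) + b.
no_parens = ("+", ("+", ("*", "a", "3"), "4"), "b")
# Tree a parser might build with '+' binding tighter than '*': a * ((3 + 4) + b).
swapped = ("*", "a", ("+", ("+", "3", "4"), "b"))

source_symbols = ["a", "*", "(", "3", "+", "4", ")", "+", "b"]
```

Only the correct tree reproduces the symbols of the source text; both faulty trees yield a different stream, so a comparison against the original program fails.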



[Figure: two erroneous syntax trees for the example expression]

(a) Wrong Parentheses                (b) Wrong Precedences



Fig. 4. Erroneous Syntax Trees 



5 Conclusions 

We addressed the problem of compiler verification for real-world compilers and 
languages with the focus on the analysis phase, and presented a concrete front- 
end verification framework. Our approach emphasizes the software engineering 
aspect, because it bridges the gap between the verification of such a complex 
software system and its practical implementation, especially with generators. 

The proposed compiler construction framework makes it possible to implement
verified front-ends down to a correct machine implementation. The main idea is to
assure correctness of the implementation by introducing runtime program checkers
that check the result of syntactic and semantic analysis. The result of such a ‘checked’
analysis phase is an attributed abstract syntax tree that carries all the information
needed for the transformation phase. We want to stress here again that this
checking is independent of the interleaving of semantic and syntactic analysis.
Even if the syntactical structure is determined only after the semantic analysis,
the checking can be performed independently. Measurements in our case-study 
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C/Sather-K 




Binary Prog. 




Lines 


Byte 


Byte 


Generators COCKTAIL 


110.000 


2.4 MB 


1.2 MB 


Generated C Code 
Impl. IS-Ftont-End 


22.000 


600 KB 


300 KB 




500 (Parser) 


14 KB 




Checker (Sather-K) 


+ 100 (Compare) 


3 KB 


200 KB 




+700 (AST) 


30 KB 





The first line shows the amount of C code of the compiler toolbox COCKTAIL used to generate the unverified 
front-end. This includes the involved generators for scanner, parser and abstract tree construction. The second line 
shows the same information for the herewith generated front-end (C code) in lines and bytes, and the size of the 
compiled program. To obtain a verified implementation without checking one would have to verify the generated C 
code or the generators. The ’Checker’ line shows the amount of code (Sather-K) needed to construct the fully 

functional program checker. 

Table 1. Case study: Lines of program code to verify for a program-checked 
front-end 



(see table 1) show the practicability of our approach. The number of lines to 
verify is decreased by a factor of 80 compared to the generator source and 1:17 
compared to generated C code. In our case-study we compile a C-subset language 
IS [4,13] to DEC-alpha machine code. 

Though we did not discuss the correctness of the transformation and code
generation phase, this is part of our work in the Verifix project. Verifix is a
large-scale case study in program verification with the major goal to verify not
only specification and high-level implementation of compilers, but also to guarantee
the correctness of their final binary executables on hardware, cf. [7].
State-of-the-art compiler construction uses complex and highly sophisticated
algorithms in order to achieve efficient code. Assuring correctness by checking
their results enables us to use these algorithms in our verified compiler
implementation and even to generate them with available unverified compiler
generators [5].

Acknowledgements This work is supported by the Deutsche Forschungs- 
gemeinschaft project Go 323/3-1 Verifix (Construction of Correct Compilers). 
We are grateful to our colleagues in Verifix. 
Abstract. Integration of non-formal methods, notations and tools with
formal ones is a promising way of linking scientific results to the daily
work of practitioners. In this paper, we present a formal notation based
on a synchronous reactive execution semantics (Synchronous Reactive
System) for graphical specifications (SA/RT models). We use the
Synchronous Reactive System as an intermediate format to formally verify
graphical specifications using the SMV model checker. We deal with the state
space explosion problem using modular verification.



1 Introduction 

Structured Methods [19], also known as Structured Analysis for Real-Time
(SA/RT), are a widespread graphical formalism that is adequate for modeling
Reactive Systems and is supported by a large number of commercial CASE tools.
Most of these tools, however, lack analytical capabilities (which are usually
limited to syntax checks such as balancing, or to simulation).

The original (informal) definition of the semantics as proposed by Ward and
Mellor is inspired by the execution rules of Petri nets. In this paper, we
use a more up-to-date, deterministic and causal semantics, similar to the one
implemented in STATEMATE [10] or RSML [12]. The essential difference with
regard to Ward’s approach is that more than one transition can be executed in
parallel at each step.

Little work has been done on the model checking of this type of graphical
specification. In a previous work [18] we used SA/RT methods in conjunction
with SMV [15], where the model is executed as a set of interleaved
processes. In [7] and [8] Statecharts are used, but the semantics is not based
on the concept of micro- and macro-steps and modular verification is not used.
Anderson et al. [2] [3] have used SMV to verify requirements written in RSML.
They perform a manual translation and verify some interesting properties (safety,
transition and function consistency).

* This work has been funded by the “Comision Interministerial de Ciencia y
Tecnologia” (Spain) under project EDIC (TIC96-0652).

Most of the current approaches to verification of synchronous systems (see 
e.g. [If]) perform a first step in which a global transition graph is elaborated 
and verification is performed on this global graph (the same as the one that is 
produced by the compilation process in Esterel programs [5]). But the verifica- 
tion using a precompiled transition graph does not resolve the state explosion 
problem, because if the system is composed of different subsystems that are not 
tightly coupled, the total number of states increases exponentially. In such cases, 
it is very important to partition the model and to perform separate verifications 
on each part of the model (modular verification). 

In Section 2 we describe the computational model of Synchronous Reactive
Systems, which we use as an intermediate format into which the graphical
specification is compiled. In Section 3 we sketch the translation procedure from
a Synchronous Reactive System into the language accepted by a model checker (SMV
[15]), and show how modular verification can be performed. Finally, in
Section 4 some conclusions and future work are presented.

2 The Framework of Synchronous Reactive Systems

In this section, we give a brief introduction to SA/RT models and
present the underlying computational model, which we call Synchronous Reactive
System (SRS).

2.1 SA/RT Methods 

SA/RT is a short name for Structured Analysis methods with extensions for
Real Time. Using Structured Methods we can view the model of the system as a
leveled set of diagrams that include concurrent processes and the communication
between them. Each process communicates with the others and with the
environment using data and control flows (in our model only control flows are
needed, and we call them events). Each process is decomposed into a diagram
showing a more detailed view. The primitive control processes (processes that
are not decomposed further) are specified using State Transition Diagrams (STDs).



2.2 State Transition Diagrams

In the SA/RT methods [19], the behaviour of a primitive control process is
defined using a State Transition Diagram or STD. An STD contains all states
that the process may reach and all transitions that it may perform. In the
rest of this paper, we use the terms “process” and “STD” interchangeably, since
there is a mapping between a process and its STD.

Definition 1. An STD is a 5-tuple $\langle \Sigma, s_0, I, O, \delta \rangle$ where

— $\Sigma$ is the set of states,

— $s_0$ is the initial state of the STD, $s_0 \in \Sigma$,

— $I$ is the set of input events that the STD receives,

— $O$ is the set of output events that the STD produces, $I \cap O = \emptyset$,

— $\delta$ is the transition relation, $\delta \subseteq \Sigma \times I \times O \times \Sigma$.

Usually, we denote the transition $\tau = (s, c, a, s') \in \delta$ using the notation
$s \xrightarrow{c/a} s'$, which means that the STD executes the transition $\tau$ when it
receives the input event $c$ (see Remark 1), changing to state $s'$ and producing
the set of output events $a \subseteq O$. In the context of individual transitions, we
refer to the pair $(c, a)$ as the label of the transition, to $c$ as the condition
and to $a$ as the action. Every STD implicitly has a control variable or control
state $\pi$, which denotes the local state of the process (initially $\pi = s_0$). A
transition $\tau = (s, c, a, s')$ is enabled if $\pi = s$ and $c$ evaluates to true. The
set of enabled transitions in a state $s$ is denoted by $enabled(s)$.

Remark 1. The original syntax from Ward [19] specifies that a condition in an
STD must be composed only of control flows (individual events). In order to
achieve a higher expressiveness of the specification, we allow conditions to
be formed by logical expressions over events of $I$, values of the states of other
processes and the proposition true (equivalent to the “blank” condition in the
graphical model).
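
Definition 1 and the notion of enabled transitions can be sketched in Python as follows. This is only an illustrative data structure, not part of the original tool chain: the class names `Transition` and `STD`, the field names, and the choice to model a condition as a predicate over the set of currently present events are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List

# Assumption: a condition c is a predicate over the set of present events.
Condition = Callable[[FrozenSet[str]], bool]


@dataclass(frozen=True)
class Transition:
    source: str                # s
    condition: Condition       # c
    action: FrozenSet[str]     # a, the output events produced
    target: str                # s'


@dataclass
class STD:
    states: FrozenSet[str]
    initial: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]
    transitions: List[Transition]
    pi: str = ""               # control state, set to the initial state below

    def __post_init__(self) -> None:
        if self.initial not in self.states:
            raise ValueError("initial state must be one of the states")
        if self.inputs & self.outputs:
            raise ValueError("input and output events must be disjoint")
        self.pi = self.initial

    def enabled(self, state: str, events: FrozenSet[str]) -> List[Transition]:
        """Transitions leaving `state` whose condition holds for `events`."""
        return [t for t in self.transitions
                if t.source == state and t.condition(events)]
```

A transition with the “blank” condition would simply use `lambda events: True` in this representation.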

2.3 Synchronous Reactive Systems 

An SRS consists of a set of STDs interacting over a set of input events and a set
of output events. The events through which the STDs communicate are called
internal events, since they are not observable from outside the SRS.

The semantics adopted to describe the behaviour of an SRS is related to
Berry’s synchronous hypothesis [5] (the system reacts instantaneously
to external events) and to the micro/macro-step semantics of STATEMATE [10]
and RSML [12]. Basically, we can view the execution of the SRS as an infinite
series of macro-steps that produce sequences of output events in response to input
events; internally, each macro-step can be viewed as a chain of micro-steps.
At each micro-step, the system reacts to the input events, producing output events
and internal events that initiate further micro-steps until no more micro-steps
can be taken.

Definition 2. A Synchronous Reactive System (SRS) $\Phi$ is a 5-tuple
$\langle \Delta, GE, IE, OE, \rightarrow \rangle$ where

— $\Delta = \{M_1, M_2, \ldots, M_n\}$ is the set of STDs that compose $\Phi$,

— $GE$ is the set of internal events that communicate the STDs in $\Delta$,
$GE = (\bigcup_{i=1}^{n} I_i) \cap (\bigcup_{i=1}^{n} O_i)$,

— $IE$ is the set of input events that $\Phi$ receives from the environment,
$IE = \bigcup_{i=1}^{n} I_i - GE$,

— $OE$ is the set of output events that $\Phi$ produces to the environment,
$OE = \bigcup_{i=1}^{n} O_i - GE$,

— $\rightarrow$ is the transition relation of $\Phi$, which describes the semantics
outlined above and is defined next.

The initial state of $\Phi$ is formed by the initial states of the STDs in $\Delta$,
$S_0 = \{s_0^1, s_0^2, \ldots, s_0^n\}$, where $s_0^i$ is the initial state of $M_i$. A state
(global state) $S$ of $\Phi$ is composed of the control states of the STDs in $\Delta$,
$S = (\pi_1, \ldots, \pi_n)$. We call $C = (S, IE, OE, GE)$ a configuration of $\Phi$. The set
of all possible configurations of $\Phi$ is denoted by $Global(\Phi)$. We describe the
transition relation $\rightarrow \subseteq Global(\Phi) \times Global(\Phi)$ by means of the following
inference rules (similar to [13]):



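Advance Rules: An advance rule applies to an STD $M_i$ if it has an enabled
transition in its state of the current configuration. If the STD has multiple
enabled transitions in the configuration, one of them is taken
non-deterministically:

$$\frac{(\pi_i = s_i) \wedge (\exists \tau_i : s_i \xrightarrow{c_i/a_i} s'_i \in \delta_i \,/\, \tau_i \in enabled(s_i))}{((\pi_1, \ldots, s_i, \ldots, \pi_n), IE, OE, GE) \rightarrow ((\pi_1, \ldots, s'_i, \ldots, \pi_n), \emptyset, OE \cup A_i, A'_i)} \tag{1}$$

where $A_i = \{e \in a_i \,/\, e \in OE\}$ and $A'_i = \{e \in a_i \,/\, e \in GE\}$.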

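If several STDs of the SRS have enabled transitions, then all of them execute
simultaneously:

$$\frac{\begin{array}{c}(\pi_i = s_i) \wedge (\exists \tau_i : s_i \xrightarrow{c_i/a_i} s'_i \in \delta_i \,/\, \tau_i \in enabled(s_i)) \\ (\pi_j = s_j) \wedge (\exists \tau_j : s_j \xrightarrow{c_j/a_j} s'_j \in \delta_j \,/\, \tau_j \in enabled(s_j))\end{array}}{((\pi_1, \ldots, s_i, \ldots, s_j, \ldots, \pi_n), IE, OE, GE) \rightarrow ((\pi_1, \ldots, s'_i, \ldots, s'_j, \ldots, \pi_n), \emptyset, OE \cup A_i \cup A_j, A'_i \cup A'_j)} \tag{2}$$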



Stuttering Rule: If an STD $M_i$ is in a state in the current configuration and
it does not have any enabled transition, then it consumes events but does not
produce events. This notation is adequate to represent the concept of reactivity
(in any state, there exists at least one transition to execute):

$$\frac{(\pi_i = s_i) \wedge (\forall \tau_i : s_i \xrightarrow{c_i/a_i} s'_i \in \delta_i \,/\, \tau_i \notin enabled(s_i))}{((\pi_1, \ldots, s_i, \ldots, \pi_n), IE, OE, GE) \rightarrow ((\pi_1, \ldots, s_i, \ldots, \pi_n), \emptyset, OE, \emptyset)} \tag{3}$$



The above rules show the execution of the SRS at the level of micro-steps. When
all STDs $M_1, \ldots, M_n$ in the SRS can only advance by executing rule 3, the
SRS has reached a stable configuration (no more transitions can be taken).

At the level of macro-steps, the execution of the SRS can be viewed as a sequence
of stable configurations, where at each one the SRS receives input events $IE$ and
produces output events $OE$:

$$C = (S, IE, \emptyset, \emptyset) \Rightarrow C' = (S', \emptyset, OE, \emptyset) \tag{4}$$

where $S$ and $S'$ represent the global states of the SRS before and after the
macro-step, $C$ is a stable configuration and $\Rightarrow$ represents a chain of
micro-steps:

$$C \Rightarrow C' = C \rightarrow C_1 \rightarrow \ldots \rightarrow C_n \rightarrow C' \tag{5}$$
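
The micro/macro-step semantics can be made concrete with a small Python simulator. It is a sketch under simplifying assumptions, not the formal semantics itself: a condition is either a single event name or `None` (the “true” condition), the first enabled transition of each STD is taken instead of a non-deterministic choice, and the names `micro_step`, `macro_step` and `max_micro_steps` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# (source state, triggering event or None for "true", output events, target state)
Transition = Tuple[str, Optional[str], FrozenSet[str], str]


def micro_step(stds: Dict[str, List[Transition]],
               states: Dict[str, str],
               events: Set[str]) -> Tuple[Dict[str, str], Set[str], bool]:
    """One micro-step: every STD with an enabled transition fires, the rest stutter."""
    new_states = dict(states)
    produced: Set[str] = set()
    fired = False
    for name, transitions in stds.items():
        for source, condition, action, target in transitions:
            if source == states[name] and (condition is None or condition in events):
                new_states[name] = target
                produced |= action
                fired = True
                break  # simplification: take the first enabled transition
    return new_states, produced, fired


def macro_step(stds: Dict[str, List[Transition]],
               states: Dict[str, str],
               inputs: Set[str],
               internal: Set[str],
               max_micro_steps: int = 100) -> Tuple[Dict[str, str], Set[str]]:
    """Chain micro-steps until a stable configuration is reached."""
    events = set(inputs)          # external inputs are visible in the first micro-step only
    outputs: Set[str] = set()
    for _ in range(max_micro_steps):
        states, produced, fired = micro_step(stds, states, events)
        if not fired:             # only the stuttering rule applies: stable
            return states, outputs
        outputs |= produced - internal   # events sent to the environment
        events = produced & internal     # internal events trigger the next micro-step
    raise RuntimeError("no stable configuration reached")
```

The bound `max_micro_steps` is only a guard for the sketch; it stands in for the step-termination property, which has to be established separately.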

3 Verification of SRS 

The intermediate format presented in the previous section is translated into the
language accepted by the SMV [15] model checker. In the SMV language, we can
specify the operational model and check its desired properties written in the
temporal logic CTL [6]. The execution of an SMV specification can be viewed as a
sequence of steps that change the values of variables according to the transition
relation of the automaton represented by the SMV code. We outline the
translation procedure of the semantics ($\rightarrow$ and $\Rightarrow$) to SMV and how we
include support for performing modular verification.



3.1 Translating SRS into SMV 

The execution of each macro-step consists of a first step, in which the changes
produced in the environment are perceived, and a sequence of micro-steps,
until a stable configuration is reached. Since SMV executes step by step, without
any difference between steps, we must differentiate the first step from the others
using a special variable named MicroStep. The following pseudo-code reproduces
the behavior of $\Rightarrow$:

    if MicroStep = 0 then
        Allow changes in external inputs and set MicroStep = 1
    else
        if some transition can be executed then
            Perform a micro-step by executing transitions
        else
            Set MicroStep = 0
        end if
    end if

We use a boolean variable for each event and a variable representing the
state of the process (namely, $\pi$). Changes of each variable representing an
external input event $in_i \in IE$ are performed by statements that set the event to
a random value only when the value of MicroStep is 0. Otherwise its value is set to
0 (external input events can influence only the first micro-step):

    next(in_i) := case
        MicroStep=0 : {0,1};
        1 : 0;
    esac;
The execution of each micro-step ($\rightarrow$) follows a similar schema. However, since
we need to know whether some transitions were enabled in order to decide
whether to continue the chain of micro-steps, we use an additional
variable TP for each process, which represents the transition that will be executed.
For instance, if two transitions $\tau_1$ and $\tau_2$ leave the state S0 towards S1 and S2,
having conditions C1 and C2 respectively, we first select the transition to be
executed with a direct assignment (if no transition can be executed, then TP is
set to 0):

    TP := case
        MicroStep=1 & pi=S0 & C1 : 1;
        MicroStep=1 & pi=S0 & C2 : 2;
        1 : 0;
    esac;

The SMV case statement is deterministic: it selects the first row whose
condition is true. If we want the selection to be non-deterministic, we can
proceed as shown in [18].

Output events and next states are set according to the value of this variable.
For instance, if event $out_1 \in OE$ is sent as a consequence of the execution of
transition $\tau_1$ and also as a consequence of another transition $\tau_3$, the next
value for out1 will be:

    next(out1) := case
        TP=1 | TP=3 : 1;
        1 : 0;
    esac;

The last term sets the value to zero for the same reason as in external input 
events. 

The next state (assuming that variable pi holds its value) will be coded as: 

next (pi) := case 
TP = 1 : SI; 

TP = 2 : S2; 

1 : pi; 
esac ; 

The chain of micro-steps finishes when no transition can be executed. For
instance, if we have two processes in the SRS, each one having its own variable
(TP1 and TP2) indicating the transition executed, we will have:

    next(MicroStep) := case
        MicroStep=0 : 1;         -- environment has changed
        TP1=0 & TP2=0 : 0;       -- end of macro-step
        1 : MicroStep;
    esac;
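
The translation scheme of this section is regular enough to be generated mechanically. The following Python sketch emits the `TP` selection and the next-state assignment for one process from a list of transitions. It is an illustration only: the function names `tp_case` and `next_state_case` and the triple format `(source, condition, target)` are assumptions, and the emitted text merely mirrors the fragments shown above.

```python
from typing import List, Tuple

# (source state, condition as SMV text, target state); illustrative format
Transition = Tuple[str, str, str]


def tp_case(state_var: str, transitions: List[Transition]) -> str:
    """Emit the assignment that selects the transition to execute (0 = none)."""
    lines = ["TP := case"]
    for index, (source, condition, _target) in enumerate(transitions, start=1):
        lines.append(f"  MicroStep=1 & {state_var}={source} & {condition} : {index};")
    lines.append("  1 : 0;")
    lines.append("esac;")
    return "\n".join(lines)


def next_state_case(state_var: str, transitions: List[Transition]) -> str:
    """Emit the next-state assignment driven by the value of TP."""
    lines = [f"next({state_var}) := case"]
    for index, (_source, _condition, target) in enumerate(transitions, start=1):
        lines.append(f"  TP = {index} : {target};")
    lines.append(f"  1 : {state_var};")   # no transition selected: keep the state
    lines.append("esac;")
    return "\n".join(lines)
```

Calling `tp_case("pi", [("S0", "C1", "S1"), ("S0", "C2", "S2")])` reproduces the two-transition example, with the transitions numbered in list order.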
3.2 Approach for Modular Verification

Due to the state explosion problem, when the size of the system grows it is
desirable to be able to perform local verifications on separate components and
deduce some global property for the whole model. This procedure can be called
Modular Verification and is based on the Abadi and Lamport composition theorem
[1], also known as rely-guarantee or assumption-commitment rules. If a model $\Phi$
can be decomposed into the parallel composition of two (or more) components
(SRSs) $[\Phi_1 \| \Phi_2]$, we can perform the verification of local properties
($\varphi_1$, $\varphi_2$) for each component ($\Phi_1$ or $\Phi_2$, respectively) assuming some kind
of behavior for the other (abstracted) component. For instance, we can prove that
$\varphi_2$ is true for the component $\Phi_2$ assuming a certain behavior $\varphi_{E1}$ of the
abstracted model $\Phi_1$, and symmetrically for $\Phi_1$. If we also prove that $\varphi_{E1}$ is
true for $\Phi_1$ (discharging the assumption) and its symmetric counterpart ($\varphi_{E2}$ is
true for $\Phi_2$), then the property $\varphi_1 \wedge \varphi_2$ will be true for the whole model $\Phi$,
provided that the assumptions are safety properties [1].

When we divide the model into different SRSs (each one grouping several
processes in the SA/RT model), the internal events through which they communicate
must be treated differently from the others. We call these Ghost Events, and at
each micro-step we set their values to random ones. A safety property which is
true in the component will also be true in the whole model (the converse is not
true, since this simplification introduces additional computations that may not
be present in the whole model).

When including ghost events, the termination of each macro-step (determined
by the portion of code that gives a value to the variable MicroStep) must
be modified to take into account the fact that, even if no transition is
enabled at some step, the abstracted model may still be executing a transition
that will send a ghost event at the next micro-step. The end of the macro-step
must include additional conditions to avoid finishing when some ghost event
is set for the next step. For instance, assuming that we have two variables
(g1 and g2) representing ghost events, the above fragment becomes:

    next(MicroStep) := case
        MicroStep=0 : 1;             -- environment has changed
        next(g1)=0 & next(g2)=0
          & TP1=0 & TP2=0 : 0;       -- end of macro-step
        1 : MicroStep;
    esac;

When we wish to specify some kind of assumption, we use a set of simple rules
coded as statements of the form ASSUME(cond, g, v), which states that if condition
cond is true, then variable g must have the value v. Assumptions are included in
the portion of code that sets a value to the ghost event as shown below (recall
that the case statement selects the first condition that is true):

    next(g) := case
        MicroStep=1 & cond : v;
        -- other assumptions . . .
        MicroStep=1 : {0,1};         -- possible change
        1 : 0;
    esac;
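
The way ASSUME rules are folded into the code for a ghost event can also be generated mechanically. The Python sketch below emits the corresponding case block from a list of assumptions. The function name `ghost_event_case` and the pair format `(cond, value)` are hypothetical; the output simply follows the fragment above, with the assumptions placed before the row that allows a random change.

```python
from typing import List, Tuple


def ghost_event_case(ghost: str, assumptions: List[Tuple[str, int]]) -> str:
    """Emit the next-value assignment of a ghost event.

    Each (cond, value) pair encodes one rule ASSUME(cond, ghost, value).
    Rows are ordered so that assumptions take priority over the random
    choice, because the case statement selects the first true condition.
    """
    lines = [f"next({ghost}) := case"]
    for cond, value in assumptions:
        lines.append(f"  MicroStep=1 & {cond} : {value};")
    lines.append("  MicroStep=1 : {0,1};")   # unconstrained: random value
    lines.append("  1 : 0;")
    lines.append("esac;")
    return "\n".join(lines)
```

With an empty list of assumptions the block degenerates to the fully random ghost event.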

3.3 Specification of Properties 

System properties must be checked at the end of each macro-step. It suffices
to transform a formula $AG(\varphi)$, where $\varphi$ is another CTL formula or proposition,
into

$AG(MicroStep = 0 \rightarrow \varphi)$

Since many properties of interest are input/output responses,
which relate input events to output events, and the values of input events are
maintained only during the first micro-step, we also provide a predicate WAS0(f),
which is true if f was true at some micro-step of the current macro-step. Using an
additional variable for each such predicate, the translation is straightforward.

When we discharge an assumption (commitment) of the form
ASSUME(cond, g, v), we use a CTL formula that is checked at each micro-step,
such as:

$AG(MicroStep = 1 \rightarrow (cond \rightarrow g = v))$

An essential property of this type of model is the absence of a kind of livelock
in which the system executes an infinite chain of micro-steps (step
termination). This situation is checked with the following CTL formula:

$AG(MicroStep = 1 \rightarrow AF(MicroStep = 0))$

When we try to prove this property in a component that has some ghost
events as input, an important risk is that of falling into false infinite loops
caused by an infinite sequence of these events. In that case, a common assumption
is to prevent a ghost event from appearing more than once in a macro-step:
it suffices to use assumptions like ASSUME(WAS0(g), g, 0).
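
The three formula patterns of this section can be collected in a few helper functions. This Python sketch only builds formula strings; the helper names are hypothetical, and the concrete syntax (`->` for implication) is an assumption about the input language of the model checker rather than something fixed by the text.

```python
def at_macro_step_end(phi: str) -> str:
    """AG(phi) restricted to the ends of macro-steps."""
    return f"AG (MicroStep = 0 -> ({phi}))"


def discharge_assumption(cond: str, ghost: str, value: int) -> str:
    """Commitment for ASSUME(cond, ghost, value), checked at each micro-step."""
    return f"AG (MicroStep = 1 -> (({cond}) -> {ghost} = {value}))"


def step_termination() -> str:
    """Every chain of micro-steps eventually ends."""
    return "AG (MicroStep = 1 -> AF (MicroStep = 0))"
```

For example, `discharge_assumption("WAS0(g)", "g", 0)` yields the commitment that corresponds to the assumption preventing a ghost event from appearing twice in a macro-step.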

4 Conclusions 

The approach taken for modular verification of synchronous reactive systems
makes it possible to mitigate the effects of state explosion. Although compiling
the synchronous model prior to model checking (as in [If]) is more efficient
(due to the elimination of all micro-steps and, therefore, of their corresponding
states), the explicit representation of all micro-steps allows us to state adequate
assumptions and prove them without any distortion of the semantics of the whole
model. The assumptions used are composed of a number of simple rules whose
translation into code is straightforward. Nevertheless, they must be obtained
manually, which can be a difficult task if the interface is complex. Our present
work therefore aims at obtaining these constraints in a more automatic way.

We think that the use of modular verification might be essential when the
model is composed of relatively independent devices. The results in [16] and [9]
confirm our idea that it is possible to (nearly) interactively perform verifications
of interesting properties of a system as we describe in [17], thus making model
checking a powerful tool for detecting bugs and for debugging the specification.
Abstract. The paper describes the problem of multi-agent path planning
in an environment with obstacles. A novel approach to multi-agent
optimal path planning, using a graph representation of environment models,
is described. When planning the path of each robot, the graph model
of the environment is dynamically changed for path correction and collision
avoidance. The new algorithm changes the robots’ paths and speeds
to avoid collisions in a multi-agent environment.



1 Introduction 

Problems of controlling multi-agent robot systems have gained significant
importance. Each multi-agent robot system has a transport subsystem, which
consists of several mobile robots. The problem of controlling such a mobile robot
group can be divided into two main parts:

— Optimal global (general) task decomposition into subtasks, and their optimal 
distribution between separate robots in the group. 

— Path planning, control and movement correction for each mobile robot. 

A new approach to path planning and motion programming for mobile robots
is proposed. The method is based on graph optimization algorithms. The novelty of
the developed multi-agent path planning algorithm is as follows:

— All mobile robots are considered as dynamic obstacles. 

— Graph representation of common environment models is used for path plan- 
ning. 

— Each edge of the graph has two weights: distance and motion time (speed). 

— Weights of edges can be modified during path planning. 

— The quickest path is planned (time optimization). 

— Expert rules for speed and path correction are synthesized to provide collision 
avoidance. 

The algorithm is formulated in terms of the optimal find-path problem on 
a graph, where the graph edges are labelled with some values. It is usually 
possible to transform common environment models (e.g. vector or grid model) 
to the corresponding graph representation. Thus, the algorithm can be applied, 
for example, on visibility graphs and grid environment models. 
2 Background

The problem of path planning for various types of mobile robots has been widely
investigated by many researchers [5,8,9,10,11], but almost all of them consider
the problem of path planning for a single mobile robot. The problem of path
planning for a group of mobile robots was investigated in [10,11], but the
proposed algorithms did not provide path optimality in any sense. In [6] an
approach was introduced to control a group of mobile robots by decomposing the
global task into several subtasks with non-intersecting robot paths.
This is not possible for many practical tasks, such as manufacturing,
traffic control, etc. Therefore, the problem of adapting known optimal path
planning algorithms to multi-agent robot systems remains [1,2].

Planning mobile robot motions in a multi-agent robot system has a number
of peculiarities and some additional difficulties. They arise from the need to
take into account not only possible obstacles (including unknown ones) in the
working space, but also the movement of other robots, while planning the path of
each agent-robot. It seems logical to divide the problem of path planning and
control of mobile agent-robots in a working zone into two subproblems.

— Path planning and optimization for each agent-robot individually, taking
into account the movement of other robots. This problem can be solved by
modifying algorithms of path planning and optimization in an environment with
obstacles. At this stage, full knowledge of the environment is assumed (i.e.
the environment does not contain unknown obstacles).

— Avoidance of unforeseen collisions and correction of the planned paths in
cases when information about the environment is incomplete or robot paths
deviate from the planned ones. There are two basic alternatives to solve this
problem. The first is to correct the paths by means of various local path
correction algorithms. The drawback of this approach is non-optimal agent-robot
motions. The second is complete or locally optimal path re-planning
when new obstacles are discovered or collisions occur.

It should be mentioned that cooperation between individual agent-robots is
necessary to solve the path planning and optimization problem. Each agent-robot
has to share information about its planned path and actual motions with other
agent-robots. Maintaining the database of planned paths and coordinating motions
could be performed by a special agent-supervisor. The agent-supervisor
maintains information about the environment and the motions of each agent-robot.
Information about the environment is collected by agent-robots equipped with
sensors. The path planning system of each agent-robot can use information from
the agent-supervisor. In some cases the agent-supervisor plans paths for all the
agent-robots and transmits the planned paths to them.

It is more reasonable to assign the solution of the second problem
to the local control systems of the agent-robots, so that accident-free execution
of the robots’ tasks is ensured even when communication between the
agents malfunctions. To solve this problem, an agent-robot should have its own
sensor system, which must be able to distinguish static obstacles from moving
robots. Besides, the sensor systems of robots can be used to correct the
environment model for more accurate path planning.

3 Environment Models 

There are a lot of widely known environment models, for example, grid (occupan- 
cy cell), vector (obstacles are represented by polygons), graph (visibility graph, 
Voronoi diagram, etc.) [3] and their modifications. Special types of environment 
model, for example, analytic-predicate, semantic, etc., exist as well [1,12]. 

Each environment model has certain advantages and disadvantages for path 
planning purposes, for instance: 

The grid model is simple to use, correct and update with data gathered
by different robots. However, it requires a large amount of memory and has high
data redundancy and a lack of accuracy. Some of these drawbacks can be eliminated
by using a more comprehensive grid model [8].

Vector models feature high precision and low memory requirements, but it is
difficult to plan a path using this type of model. It is also difficult to update
the environment model with data from robots’ sensor systems, since sensor
information is usually presented in discrete form and, hence, needs to be
transformed into the vector form.

Graph models are more suitable for path and motion planning problems.
As a rule, a graph model consists only of possible paths, i.e. information about
obstacles is excluded when the graph is constructed. Grid and vector models
can be mapped onto the graph model. Various algorithms are known for
solving the optimization problem on a graph, for example Dijkstra’s algorithm, A*,
the D* algorithm [5], etc. The possible paths in a vector environment model can be
represented by a visibility graph. The visibility graph is a graph whose nodes
represent vertices of polygonal obstacles and whose edges represent possible
straight paths connecting the obstacle vertices, i.e. lines of “visibility”. Once
the static graph is constructed, target and starting points are added and the
visible edges connecting them with other graph nodes are computed. In what
follows, the graph model is used to plan paths. A graph of possible paths can be
obtained from both vector and grid environment models. Moreover, a graph of
admissible paths can be constructed on the basis of the agent-robots’ experience of
motion. Information about agent-robots’ motions can be stored separately or in the
graph nodes.

In summary, the advantage of using a visibility graph, or a graph of possible
paths, for motion planning is that it is a simple, well-understood method
which yields optimal paths in 2D or 3D configuration space.

4 Graph Environment Model 

The graph environment model used for multi-agent optimal path planning is 
described below. Points (places) in the environment and admissible (possible) 
paths between them are represented by the graph, nodes of which represent 




506 Fedor A. Kolushev and Alexander A. Bogdanov 



certain places in the environment and edges represent admissible paths. Each
edge of the graph has a weight that corresponds to path length, travel time,
difficulty of traveling, etc. between the corresponding nodes. Note that, by
construction, the graph consists only of admissible paths.

Let us consider a graph $G(V, E)$ with $M$ nodes. All nodes are numbered. Each
node $i$ has $M_i > 1$ adjacent nodes (vertices) $i_1, \ldots, i_{M_i}$. Besides, all graph
nodes are characterized by a weight $W_i$. The weight $W_i$ of a node $i$
($i = 1, 2, \ldots, M$) corresponds to the value of the minimized functional (for
example, distance or motion time). To each edge of the graph connecting nodes $i$
and $j$, two characteristics are assigned: $S_{ij}$, the distance in space between
these two nodes, and $l_{ij}$, the motion time, which also depends on the motion
speed. In summary, any such graph possesses the following properties:

Each Node of graph is characterized by: 

1. Coordinates of a point in the environment space. 

2. Value Wi of functional to be minimized (distance, time, etc.). 

3. Set of adjacent nodes $i_1, \ldots, i_{M_i}$.

4. Additional characteristics needed for multi-agent path planning, such as set 
of agent-robots, moving through the node, and the corresponding set of time 
moments. 

Each Edge of graph is characterized by: 

1. Distance Sij between nodes i and j. 

2. Weight of the edge lij corresponds to time of motion from node i to j. This 
value is variable and may be changed while planning the path. 

3. Additional characteristics. For example, the edge may have two different
weights $l_{ij}$ and $l_{ji}$ that depend on the direction of motion between $i$ and $j$.
This makes it possible to model a 3-D environment or bi-directional roads.
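
The node and edge characteristics listed above map naturally onto a small data structure. The following Python sketch is one possible representation, not the authors' implementation: the class and field names (`Node`, `Edge`, `Graph`, `visits`, `time_ij`, `time_ji`) are assumptions, and 2D coordinates are chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from math import inf
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    coords: Tuple[float, float]             # point in the environment space
    weight: float = inf                     # W_i, value of the minimized functional
    adjacent: List[int] = field(default_factory=list)
    # (robot id, time moment) pairs for robots moving through the node
    visits: List[Tuple[str, float]] = field(default_factory=list)


@dataclass
class Edge:
    distance: float                         # S_ij
    time: float                             # l_ij, may change during planning


@dataclass
class Graph:
    nodes: Dict[int, Node] = field(default_factory=dict)
    edges: Dict[Tuple[int, int], Edge] = field(default_factory=dict)

    def add_node(self, i: int, coords: Tuple[float, float]) -> None:
        self.nodes[i] = Node(coords)

    def add_edge(self, i: int, j: int, distance: float,
                 time_ij: float, time_ji: Optional[float] = None) -> None:
        """Store both directions; the two motion times may differ."""
        self.edges[(i, j)] = Edge(distance, time_ij)
        self.edges[(j, i)] = Edge(distance, time_ij if time_ji is None else time_ji)
        self.nodes[i].adjacent.append(j)
        self.nodes[j].adjacent.append(i)
```

Keeping one `Edge` object per direction is what allows direction-dependent weights, for example a slope that is slower to climb than to descend.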

5 Multi-agent Path Planning Algorithm 

Let us introduce some definitions: the shortest path is a path of minimal length;
the quickest path is a path of minimum motion time.

Suppose we need to find a node sequence that denotes the shortest path
from the start point to the target point. Before path planning, all weights
$l_{ij}$ of the graph edges have to be initialized as follows:

$$l_{ij} = \frac{S_{ij}}{V}$$

where $V$ is the average speed of the agent-robot. This assumes that
robots move along the paths with some average (economy) speed, and it also allows
braking and acceleration to be taken into account.

The weights of the graph nodes must be initialized with the maximum possible 
value $\infty$. The start node must be initialized with the start time value $W_0 = t_0$. 
Then, given the known edge weights, the shortest path is found using one of the 
optimization algorithms, for example Dijkstra’s algorithm. During path 
planning, the weights of nodes change and become equal to the moments of time at 
which the agent-robot passes through these nodes. Note that, given the 
initialization method described above, the algorithm in fact finds the 
quickest path, but when a single path is planned, the shortest and the quickest 
paths are the same. 
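
The single-robot procedure described above can be sketched with a standard Dijkstra search whose node weights are passage times. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `plan_path`, the dictionary-based graph and the example values are hypothetical, and edge times are taken to be distance divided by the average speed.

```python
import heapq
import math


def plan_path(distances, start, target, speed, t0=0.0):
    """distances: {node: {neighbour: S_ij}}. Returns (path, node weights)."""
    # Edge weights: travel time = distance / average speed.
    times = {i: {j: s / speed for j, s in nbrs.items()}
             for i, nbrs in distances.items()}
    weights = {i: math.inf for i in distances}   # all node weights start at infinity
    weights[start] = t0                          # start node holds the start time
    previous = {}
    queue = [(t0, start)]
    while queue:
        w, i = heapq.heappop(queue)
        if w > weights[i]:
            continue                             # stale queue entry
        if i == target:
            break
        for j, l in times[i].items():
            if w + l < weights[j]:               # relax: earlier passage time found
                weights[j] = w + l
                previous[j] = i
                heapq.heappush(queue, (weights[j], j))
    if weights[target] == math.inf:
        return None, weights                     # target unreachable
    path = [target]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, weights


distances = {
    "A": {"B": 4.0, "C": 10.0},
    "B": {"A": 4.0, "C": 4.0},
    "C": {"A": 10.0, "B": 4.0},
}
path, weights = plan_path(distances, "A", "C", speed=2.0, t0=1.0)
```

After the call, each entry of `weights` on the path is the moment at which the robot passes that node, which is the information the multi-agent variant needs to store.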

When planning the paths of several robots, let us consider the path of each robot 
not only in the Cartesian space of the environment (as was done for single-robot path 
planning), but in the time-space continuum, in order to take 
into account the movement of other robots. Such an approach makes it possible to avoid collisions 
of separate agent-robots moving simultaneously in the time-space continuum. 
Hence, paths are planned not in a 2D planar environment, but in a 3D time-space 
environment, taking into account the movement of all other agent-robots. Let us 
note here that if one-agent path planning is performed in a 3D Cartesian environment, 
the multi-agent path planning is performed in 4D space, with regard 
to time (the schedule of robot movement). The algorithm described below uses this 
approach and plans agent-robot paths sequentially (path by path); when 
planning the next robot path, all already planned paths are taken into account 
to eliminate collisions. 

According to the described approach, the main differences between the developed 
multi-agent optimal path planning algorithm and the one-agent one are as follows: 

— Each graph node $i$ ($i = 1, 2, \ldots, M$) is assigned not only its weight 
$W_i$; the node additionally stores two sets: the moments of time when other 
agent-robots move through this node $i$ (let $t_{ji}$ be the time when robot $j$ passes 
through node $i$), and the IDs of these agent-robots. 

— The graph (in particular, the weights of the edges) can be changed when planning 
the path of each robot to avoid collisions. 

For multi-agent path planning, the one-agent path planning algorithm must 
be supplemented with a number of expert rules that provide collision-free planning. 
Collision avoidance is performed by means of graph correction, i.e. by changing 
edge weights. This results either in path correction (a robot is forbidden to 
move on an edge occupied by another robot) or in a change of the robot’s speed (the robot 
is forced to move faster or slower on some edges in order to free up the way 
for others whose paths were planned earlier and, hence, are already known). 
Besides, if the D* algorithm is used as the basic path optimization algorithm, the 
distance between two nodes can be changed. Changing the distance corresponds 
to a correction of the environment model. 

Initialization of the graph node weights $W_i$ ($i = 1, 2, \ldots, M$) is the same as in the 
one-agent path planning algorithm, and it is performed before planning each robot path. 
When planning a path for any robot, graph node weights are changed 
just as in the one-agent path planning algorithm: 

$$W_{i_j} = \begin{cases} W_i + l_{i i_j}, & \text{if } (W_i + l_{i i_j}) < W_{i_j}, \\ W_{i_j}, & \text{otherwise.} \end{cases} \qquad (2)$$

The only difference is that this rule is supplemented with expert rules for avoiding 
collisions in the graph nodes, which correspond to crossroads, and expert rules 
for avoiding collisions on the graph edges, which correspond to straight roads 
(we assume one-way simultaneous movement, i.e. no two robots can move simultaneously 
on the same graph edge in different directions). 

5.1 The Expert Rules 

1. Avoiding collisions in the graph nodes (crossroads) 

if $W_i + l_{i i_j} = t_{k i_j}$ ($k = 1, 2, \ldots, n$), where $n$ is the number of robots, 
then $l_{i i_j} = l_{i i_j} + \varepsilon$, where $\varepsilon$ is a value that defines the minimum time interval 
between different robots passing the same crossroads. This value must provide 
safe crossroads passage; hence, it depends on robot sizes and speeds. 
The weight $W_{i_j}$ of node $i_j$ is then computed according to formula (2). This 
means that the time of the robot motion on the graph edge from node 
$i$ to $i_j$ increases by $\varepsilon$ time units, which corresponds to a change of the robot speed. The 
speed is piecewise constant on the path, and is computed for each edge 
connecting nodes $i$ and $j$ as 

$$V_{ij} = \frac{S_{ij}}{l_{ij}}. \qquad (3)$$



2. Avoiding collisions on the graph edges (straight roads) 

if $(W_i < t_{ki}) \wedge (W_i + l_{i i_j} < t_{k i_j})$ ($k = 1, 2, \ldots, n$), then 
if $t_{ki} > t_{k i_j}$, then this is the case when the two robots move in opposite 
directions, and the robot whose path is being planned will pass through the 
edge before robot $k$. No collision happens; hence, a change of the edge weight 
is not necessary. The weight of the next node $W_{i_j}$ is computed as (2). 

else if $(t_{ki} < t_{k i_j})$ $\wedge$ [second condition, involving $t_{ki}$, $t_{k i_j}$ and $l_{i i_j}$, illegible in the source], then collision is possible: 
robot $k$ will follow the robot whose path is being planned and hit it 
on the edge. To avoid collision, it is necessary to change the edge weight for 
the current robot (i.e. to change the motion time by increasing speed): 

$l_{i i_j} =$ [right-hand side, an expression in $W_i$, $t_{ki}$, $t_{k i_j}$ and $\varepsilon$, illegible in the source] (4) 

Then the node weight $W_{i_j}$ is computed according to (2). 
else robot $k$ will follow the robot whose path is being planned, but its speed 
is insufficient to hit the currently computed robot on the edge. Then the edge 
weight is not changed, and the node weight $W_{i_j}$ is computed as (2). 
if $(W_i > t_{ki}) \wedge (W_i + l_{i i_j} > t_{k i_j})$ ($k = 1, 2, \ldots, n$), then 
if $t_{ki} > t_{k i_j}$, then this is the case when the two robots move in opposite directions, 
and robot $k$ will pass through the edge before the robot whose path 
is being planned drives onto the edge. There is no need in this case to change 
the edge weight, since no collision occurs. Then the node weight $W_{i_j}$ is 
computed as (2). 

else if $(t_{ki} < t_{k i_j})$ $\wedge$ [second condition, involving $t_{ki}$, $t_{k i_j}$ and $l_{i i_j}$, illegible in the source], then a collision will 
happen: the robot whose path is being planned will follow robot $k$ on the 
edge and hit it due to its high speed. To avoid the collision, it is necessary to 
decrease the speed of the robot being computed, i.e. to increase its time 
of motion on this edge according to (4); then the node weight is determined 
as (2). 

else the robot whose path is being planned has insufficient speed to catch 
and hit robot $k$ before the crossroads. The node weight $W_{i_j}$ is then computed 
as (2). 



if $(W_i < t_{ki}) \wedge (W_i + l_{i i_j} > t_{k i_j})$ ($k = 1, 2, \ldots, n$), then 
if $t_{ki} < t_{k i_j}$, then collision is possible: robot $k$ will follow and hit the robot 
whose path is being planned before the crossroads. To avoid the collision, 
the speed of the current robot should be increased. To achieve this, 
the edge weight is changed according to (4); then the node 
weight is computed as (2). 

else if $t_{ki} > t_{k i_j}$, then collision cannot be avoided: robot $k$ will be 
moving on the edge in the opposite direction when the robot whose path is 
being planned drives onto the edge. To avoid collision, motion through 
the edge from node $i$ to node $i_j$ must be forbidden for the current robot. To 
this end, let us change the weight of the edge as follows: 

$$l_{i i_j} = \infty. \qquad (5)$$

Then the node weight $W_{i_j}$ is computed according to (2). Let us note that 
during further path construction this edge will not be included in the path due 
to its infinite weight. Therefore, this type of collision is also avoided. 
if $(W_i > t_{ki}) \wedge (W_i + l_{i i_j} < t_{k i_j})$ ($k = 1, 2, \ldots, n$), then $t_{ki} < t_{k i_j}$ and 
the collision is possible (it is the only possible case, since $l_{i i_j} > 0$): the 
robot whose path is being planned will follow and hit robot $k$ before the 
crossroads. To avoid the collision, it is necessary to change the edge weight 
according to (4), and then the node weight is computed as (2). 
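
The structure of the two expert rules can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names and the action labels (`"keep"`, `"adjust"`, `"forbid"`, `"speed_check"`) are hypothetical, the adjusted travel time of formula (4) is not computed, and the additional speed comparison of the first two edge cases is only reported as `"speed_check"`.

```python
import math


def apply_node_rule(w_i, l, reserved_times, eps):
    """Rule 1: shift the arrival at a crossroads that another robot passes.

    w_i            -- passage time of the current node i
    l              -- travel time on the edge from i to the next node
    reserved_times -- times at which other robots pass the next node
    eps            -- minimum safe interval between two passages
    """
    # Repeating the check after each shift is an assumption of this sketch;
    # math.isclose stands in for the exact equality used in the text.
    while any(math.isclose(w_i + l, t) for t in reserved_times):
        l += eps
    return l


def edge_speed(s, l):
    """Piecewise-constant speed on an edge: distance over travel time."""
    return s / l


def classify_edge_case(w_i, l, t_ki, t_kij):
    """Rule 2: return (case number, action) for one other robot k.

    t_ki and t_kij are the times at which robot k passes the two end
    nodes of the edge; cases are numbered in the order of the text.
    """
    same_direction = t_ki < t_kij      # robot k also travels from i to i_j
    arrival = w_i + l
    if w_i < t_ki and arrival < t_kij:
        case = 1
    elif w_i > t_ki and arrival > t_kij:
        case = 2
    elif w_i < t_ki and arrival > t_kij:
        case = 3
    elif w_i > t_ki and arrival < t_kij:
        case = 4
    else:
        return 0, "undefined"          # equalities are outside the four cases
    if case in (1, 2):
        action = "speed_check" if same_direction else "keep"
    elif case == 3:
        action = "adjust" if same_direction else "forbid"   # forbid: weight = infinity
    else:
        action = "adjust"              # case 4 is always same-direction
    return case, action


new_time = apply_node_rule(2.0, 3.0, reserved_times=[5.0, 5.5], eps=0.5)
```

Here `"forbid"` corresponds to setting the edge weight to infinity as in (5), and `"adjust"` to recomputing the edge weight by (4).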



6 Summary 

Using a graph of possible paths makes the developed robot path planning algorithms 
independent of the environment model, thus widening their applicability. 

These algorithms provide global optimality of path planning according to 
various given optimality criteria: least motion time, least path length, etc. 

The multi-agent path planning algorithm also provides robot collision avoidance. 
It automatically plans safe robot paths that do not intersect 
each other in the time-space continuum. Simulation results confirm the effectiveness of 
the synthesized algorithms. 

Finally, let us note that the described multi-agent algorithm implies sequential 
path planning for each of the robots (path by path); when planning the 
next robot path, all already planned paths are taken into account to eliminate 
collisions. Therefore, the path of the first robot in the sequence is planned with the 
one-agent algorithm, the path of the second robot is planned with regard to the 
first robot’s path, the paths of the first two are taken into account when planning 
the path of the third robot, etc. The described algorithm provides optimality of all 
planned paths in the sense that the currently planned path is optimal among all paths possible 
at this stage. However, the paths (and, hence, their lengths and motion times) 
depend on the order of planning, i.e. there is a question of which robot path should 
be planned first, which second, etc. This problem is not significant if 
the ratio of the number of possible paths on the graph to the number of robots is large 
enough. But if the described expert rules correct the graph (edge weights) too 
frequently during path planning, then the choice of the right sequence of robots 
for path planning may have a significant influence on the overall performance of the robot team. 
This problem is still open and is a subject for separate research. 
Abstract. The paper describes an experimental system for understanding 
short texts from a limited problem domain (weather forecast telegrams 
written in Russian). A semantics-oriented and text type specific 
approach to analysis is proposed which gives preference to lexical-semantic 
and topical coherence mechanisms in their relation to the domain 
structure. The system is implemented with both classical means for 
knowledge representation and processing and methods of object-oriented 
and agent-based techniques. 



1 Introduction 

The paper describes an experimental system for understanding real short texts 
from a limited problem domain, cf. previous work in [1,2]. The goal of the analysis 
is to explicate the informational content of the input text by a semantic network 
(tree), which is used as a basic knowledge representation language suitable for 
further transforming to represent information in any other terms. The choice of 
formal means and the underlying linguistic approach are based upon the follow- 
ing principles: a) the understanding system is both the domain and genre (text 
type) specific; b) the analysis procedure is semantics-oriented; c) information of 
different linguistic levels (lexical, syntactic, semantic, pragmatic) is processed 
simultaneously due to the object-oriented paradigm using class hierarchy with 
multiple inheritance; d) special means to represent linguistic indeterminate units 
are utilized; e) the declarative descriptions with a system of agents provide a local 
bottom-up parsing procedure. 

The presented experimental system is implemented with the help of the software 
environment SemP-A, an advanced version of the SemP-TAO system [3]. 
SemP-A is based on an integrated knowledge representation model which combines 
both classical means for knowledge representation and processing (such as 
frames, semantic networks with binary relations, etc.) and methods of object-oriented 
and constraint programming. Important features of the environment 
are the ability to operate with objects that can have attributes with imprecisely 
defined values, and the use of the agent-based technique as the main means 
for defining logical inference and data processing. 

Each agent reacts only to related events (e.g. the appearance of new objects 
of a certain class, a change in the values of their attributes, or the setting of new relations 
between objects). Activation of an agent can lead to the creation of new objects or 
a change in the state of existing ones. This, in turn, causes the activation of other agents 
associated with the new or modified objects, and so on. Unlike production 
systems, which use an expensive pattern-matching routine, the activation of agents 
is based on an associative event-driven mechanism that significantly increases 
the efficiency of the inference and control processes. 
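
The event-driven activation described above can be illustrated with a small publish/subscribe sketch. It is not SemP-A code: the `Network` class, its methods and the two example agents are assumptions made for illustration, and only the class names LexWord and SemWord are borrowed from the paper.

```python
from collections import defaultdict, deque


class Network:
    """Objects plus agents that are activated by object-creation events."""

    def __init__(self):
        self.objects = []
        self.subscribers = defaultdict(list)   # class name -> list of agents
        self.events = deque()                  # objects waiting to be processed

    def subscribe(self, class_name, agent):
        self.subscribers[class_name].append(agent)

    def create(self, class_name, **slots):
        obj = {"class": class_name, **slots}
        self.objects.append(obj)
        self.events.append(obj)                # creation is the triggering event
        return obj

    def run(self):
        # Only agents associated with the object's class are activated,
        # so no global pattern matching over all rules is needed.
        while self.events:
            obj = self.events.popleft()
            for agent in self.subscribers[obj["class"]]:
                agent(self, obj)


log = []


def lexical_agent(net, obj):
    # Hypothetical one-word vocabulary look-up.
    if obj["text"] == "wind":
        net.create("SemWord", concept="Wind")


def topical_agent(net, obj):
    log.append(obj["concept"])


net = Network()
net.subscribe("LexWord", lexical_agent)
net.subscribe("SemWord", topical_agent)
net.create("LexWord", text="wind")
net.run()
```

Creating one lexical object activates the lexical agent, whose new object in turn activates the topical agent, mirroring the chain of activations described in the text.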



2 Text Corpus and Problem Domain 

The texts under consideration are weather forecast telegrams sent by local forecasters 
to the central meteorological offices (M-texts). An example of an M-text is 
given below in literal translation from Russian: 

weather tomsk region 19/08/98= 
variable cloudiness in morning local fogs over south parts locally small 
short rains thunderstorms wind south south-west 7-12 m/s temp at night 
8-13 at day time 8-23 tomsk night 10-12 day 21-23= 

An M-text contains a sequence of prognostic statements with parametric 
semantics (an “object - parameter - value” scheme). The estimations are 
given in terms of parametric Features grouped around meteorological Elements 
(Precipitation, Cloudiness, WeatherPhenomena, Wind, Temperature, Inflamability) 
within topically coherent text fragments. Each topical fragment contains 
a sequence of estimations for the same Element. The correspondence between 
Elements and their Features is represented by the Element-Feature relation, the 
third argument of the relation presenting the basic parameters of the Element: 

Element-Feature(Element: “Wind”, Features: {“WindDirection”, “WindVariation”, 
“WindSpeed”, “WindGust”}, DefaultFeatures: “WindDirection”, “WindSpeed”). 
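
As an illustration, the Element-Feature relation for “Wind” could be held in a plain mapping. This is a sketch only; the dictionary layout and the helper names are assumptions, while the feature names are those given in the relation above.

```python
# Element -> its Features and its default (basic) Features.
ELEMENT_FEATURE = {
    "Wind": {
        "features": {"WindDirection", "WindVariation", "WindSpeed", "WindGust"},
        "default_features": ["WindDirection", "WindSpeed"],
    },
}


def default_features(element):
    """Basic parameters assumed for an Element when the text omits them."""
    return ELEMENT_FEATURE[element]["default_features"]


def is_consistent(element):
    """Every default Feature must also be listed among the Features."""
    entry = ELEMENT_FEATURE[element]
    return set(entry["default_features"]) <= entry["features"]
```

Keeping the defaults in text order makes it straightforward to recover omitted Features in a fixed sequence.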

Estimations are time- and site-specific, i.e. they are made with respect to 
certain Temporal and Locative objects. The territory and the date mentioned 
in the heading part of the text are the basic Loc and Temp objects of the domain. 
The objects of estimation in elementary statements are related to the basic Loc 
and Temp objects as their parts: e.g. LocValue local and TempValue in morning 
in the fragment in morning local fogs. Circumstantial Values may be implicit in 
the fragment and are in this case recovered from the previous context: e.g. over 
south parts locally small short rains | thunderstorms. 

The output semantic representation of the topical fragment wind south south- 
west 7-12 m/s from the example above is given in section 5, Fig. 4. 
3 Approach to Text Understanding 

Our approach to text understanding takes into account not only the domain 
structure but the text pragmatics as well. The telegram genre accounts for the main 
peculiarities of the text corpus. Texts are extremely concise: they are written in 
“telegraphic style”. On the one hand, the semantic units (Elements, Features, 
Loc and Temp Values) are reduced, as they can be easily recovered due to the 
strong semantic and topical coherence and regular word order. On the other 
hand, grammatical and syntactic elements are regularly omitted (lack of prepositions, 
conjunctions, or even inflexions). Means of text segmentation are absent 
(there are no punctuation marks or capital letters). Abbreviations are widely 
used. Texts contain many mistakes as a result of their spontaneous production. 

Previously, our experiments in different problem domains [1] involved local 
morphological and syntactic processing. The specificity of the M-text corpus 
results in a strong semantic bias of our approach to analysis. According to it, 
the lexical semantics of words and word collocations is defined in terms of “orientations”, 
i.e. pointers to the domain system of concepts. The semantic orientation 
indicates the set of Features that can be represented by a lexeme on the surface 
level. The topic orientation relates a lexeme to the set of Elements whose description 
admits this lexeme. For example, the vocabulary unit variable is the Value 
of the “CloudAmount” or “WindDirection” Features and topically corresponds to the 
“Cloudiness” or “Wind” Elements. This information is stored in the slots Orientation 
and TopicOrientation of the vocabulary entry of the lexeme. 

The semantics-oriented approach admits processing syntactic non-regularities 
resulting in proper output semantic structures. Several types of semantic units 
(features, values, locations, etc.) that appear in text fragment under analysis 
are combined into topical and semantic structures using orientations and word 
order information. Topical mechanisms provide the recovery of reduced semantic 
objects. 

4 Class Hierarchy 

Fig. 1 shows a part of our class hierarchy and illustrates the interaction between 
lexical units and concepts of the problem domain. The hierarchy takes into account 
the peculiarities of the text corpus: it lacks classic grammar classes (no 
verbs, nouns, etc. and no morphological characteristics). The base class Object 
has a single slot, State, with two possible values: “working” means that the object is 
to be processed, and “worked_out” means that the object is no longer subject to 
any further processing. 

The lexical hierarchy includes classes for words, numbers and signs. The 
base class LexObject contains common lexical information for the vocabulary 
look-up. A chain of LexObjects is produced by a special LexSequence relation. 

The text hierarchy is also reduced, as there are no paragraphs, sentences 
or clauses. The only text-structure class is Topic, used in the process of decomposing 
the input chain into a sequence of topically coherent fragments. 
SemWord objects are related to Topic by a special Topical relation. 
Fig. 1. Class hierarchy 



The semantic hierarchy corresponds to lexical level of the domain con- 
cepts. SemObjects are characterized with orientation slots. SemWords are ele- 
ments of the future semantic tree bound with SemTree relations. The auxiliary 
words (AuxWord) are opposed to the SemWord class as they serve to modify 
meanings or even to refine classes of SemWords but are never present in the 
resulting structure. 



5 Agents and Analysis Procedure 

The analysis procedure is performed by a set of agents, which may be classified 
according to their functions in the process. 

The first group of agents interacts with the input chain to execute the lexical 
processing. Agents react to the current portion of the chain, delimit and create 
LexObject nodes, refine their classes (LexSigns, LexNumbers and LexWords), fill 
their slots and insert them into the network. A special agent performs the vocabulary 
look-up for class and slot value information. The LexSequence relation 
joins the node being inserted into the network to the previous one. 

The agents of the second group perform the presemantic processing. They 
react to the appearance of instances of the Number or AuxWord classes. The AuxWord 
subclasses require different types of processing, and their contribution may 
be different. For example, a Preposition with locative orientation serves to disambiguate 
words like west and to refine its class as a LocValue (over west regions 
vs. wind west). Fig. 2 presents the results of lexical and presemantic analysis 
for the fragment wind south south-west 7-12 m/s. The collocations have been 
assembled, the Number orientation specified, and the interval composed and refined as 
NumericValue. 

Fig. 2. Lexical and presemantic processing 

The third group of agents realizes the topical analysis. Agents simulate 
the left-to-right “reading” of the lexical chain and interact with nodes of the SemWord 
class; the AuxWord nodes are simply passed by if their states 
are “worked_out” (otherwise the topical processing stops and waits until the 
nodes are presemantically processed). The topical relation is created between 
the “working” Topic and a SemWord node provided that their TopicOrientation 
values conform. The topical shift (creation of a new Topic node) may be 
provoked by a new SemWord if its TopicOrientation does not agree with that 
of the “working” Topic, e.g. thunderstorms | wind south south-west 7-12 m/s. 
Circumstantial SemWords provoke a subtopic hypothesis (a text fragment with 
a more precise description of the same Element) that may be later rejected. 
Fig. 3 demonstrates the results of topical analysis for our example: the new 
Topic node has been generated on meeting the Element node and further bound 
with SemWord nodes by Topical relations. 
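
The topical shift can be sketched as a left-to-right scan that opens a new Topic whenever the topic orientation of the next word does not agree with the current one. This is a simplified illustration, not the agent-based implementation: the function name, the example vocabulary and the reading of “conform” as a non-empty intersection of Element sets are all assumptions.

```python
def split_into_topics(words, topic_orientation):
    """Group words into topically coherent fragments.

    words             -- lexemes in text order
    topic_orientation -- lexeme -> set of Elements whose description admits it
    Returns a list of (set of Elements, list of words) pairs.
    """
    topics = []
    current_words, current_elements = [], None
    for word in words:
        elements = topic_orientation[word]
        if current_elements is None:
            current_words, current_elements = [word], set(elements)
        elif current_elements & elements:
            # Orientations agree: attach the word and narrow the hypothesis.
            current_words.append(word)
            current_elements &= elements
        else:
            # Topical shift: close the working Topic and open a new one.
            topics.append((current_elements, current_words))
            current_words, current_elements = [word], set(elements)
    if current_words:
        topics.append((current_elements, current_words))
    return topics


# Hypothetical vocabulary fragment; only "variable" is taken from the text.
VOCABULARY = {
    "variable": {"Cloudiness", "Wind"},
    "cloudiness": {"Cloudiness"},
    "thunderstorms": {"WeatherPhenomena"},
    "wind": {"Wind"},
    "south": {"Wind"},
    "south-west": {"Wind"},
}
fragments = split_into_topics(
    ["variable", "cloudiness", "thunderstorms", "wind", "south", "south-west"],
    VOCABULARY,
)
```

In this run the ambiguous word variable is resolved to “Cloudiness” by the following word, and the shift from thunderstorms to wind opens a separate “Wind” fragment.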



[Figure: a Topic node “Wind” linked by Topical relations to the nodes Element “Wind”, Value “Southern”, Value “Southwestern” and NumericValue “7 12 m/sec”] 

Fig. 3. Topical analysis 



The fourth group of agents performs the semantic analysis, which involves 
three types of actions. Several agents deal with the specification of semantic orientations 
of Values and Features. Special agents react to situations of semantic 
reduction in order to recover missing units. A few agents are intended to realize 
the bottom-up process of constructing the semantic tree by finding the semantically 
dominant counterpart for any SemWord node and creating a SemTree 
relation between them. All the semantic agents are able to work only under the condition 
that the Topic related to the SemWord nodes under analysis is “worked_out”, i.e. 
the topical fragment construction is completed. Consider our example wind south 
south-west 7-12 m/s and its resulting semantic structure presented in Fig. 4. The 
indeterminate Values south, south-west (“WindDirection” vs. “WindVariation”) 
and 7-12 m/s (“WindSpeed” vs. “WindGust”) have been disambiguated. The 
basic Features (“WindDirection” and “WindSpeed”) have been recovered due 
to the DefaultFeatures information of the Element-Feature relation, and the corresponding 
SemTree relations have been set up. Of the two competing Values to be attached 
to the recovered “WindDirection” Feature node, the first one has been chosen 
by a special condition on the word order. Note that the semantic trees of all the 
topical fragments of the text will be further connected to the basic Locative and 
Temporal units immediately or via their local circumstantial units (if any). 
Fig. 4. Semantic analysis 



It is necessary to emphasize that all agents work simultaneously. While lexical 
agents are processing the input chain and creating lexical nodes, topical agents 
are assembling them in coherent text fragments. The progress of topical analysis 
is being provided by presemantic agents that are creating the required conditions 
in the lexical chain. At the same time the completely analyzed topical fragments 
are subject to semantic processing. 

6 Conclusion 

Several questions of M-texts processing have been left out of the scope of this 
paper. Nevertheless, we hope that we have managed to demonstrate basic prin- 
ciples of our approach including semantics orientation, text type consideration 
and processing different types of information in parallel. The use of agent-based 
technique allows increasing efficiency of data processing control in comparison 
to production systems. This is achieved by using the associative event-driven 
mechanism instead of an expensive pattern matching routine. 

Meteorological telegrams, with their text specificity and the clear structure 
of the underlying problem domain, proved to be a good testing ground for 
experiments and for development of the agent mechanism. The most interesting 
prospect seems to be the analysis of abbreviations and mistakes. Disambiguation 
of deviating lexical units implies local multivariant processing that can be 
efficiently realized within the framework of the event-driven approach. 
Abstract. We consider an approach to the development of a speech control 
system for a robot. The robot works in an environment containing 
several rooms; it can perform user commands and answer questions 
of the following types: Where are you? or What do you see in the room? 
The system includes the following components: a speech input subsystem, a 
linguistic processor to translate English commands into a formal representation, 
the robot (simulated by a program) and a speech synthesizer 
to voice the robot’s messages. The speech input and output subsystems 
are based on standard commercially available software packages. The 
linguistic processor and robot simulator are implemented with the help 
of two original instrumental systems, Lingua-F and SemP-TAO. An 
outline of the Lingua-Voice project is also given. 



Introduction 

Although the problem of controlling technical devices by means of speech is not 
new, it is still important. It has become of particular importance recently, as 
speech recognition systems have become available. 

Modern projects have demonstrated a trend to use natural language (NL) 
in all aspects of interaction with the robot. At the specification stage, NL is 
used to state instructions to the robot or a qualitative description of the desired 
situation, while during execution of a command the robot produces detailed 
messages about its current actions. As stated in [1], the main advantage of using 
the natural language for robot control is its ability to express information with 
varying degree of detail and at different abstraction levels, which is difficult to 
achieve with a formal language. 

One of the first programs understanding natural language was the famous 
system of Winograd [2]. Another well-known system, SHAKEY [3], was a mobile 
robot without a manipulator; it could understand simple natural-language 
commands. The paper [4] proposed a system to control a remote robot with the 
help of a limited vocabulary of words in a natural language. 

The purpose of project KANTRA [1,5] is to create a system for speech com- 
munication with an autonomous mobile robot that has two manipulators and is 
designed to perform complex assembly work. 



D. Bjørner, M. Broy, A. Zamulin (Eds.): PSI’99, LNCS 1755, pp. 517-529, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 




518 George B. Cheblakov et al. 



An approach to the development of an NL interface for a system controlling 
a mobile service robot working in a room was examined in [6]. Another similar 
system [7] includes a well-developed NL interface that enables the human operator 
to use NL to describe scenes (e.g., rooms in a building, objects in the rooms, 
spatial relationships between objects, etc.) as well as commands and scenarios 
of the robot’s actions in the environment (e.g., go to a room, carry an object from 
one place to another, clean the room). 

The Russian Research Institute of Artificial Intelligence (RRIAI, Moscow- 
Novosibirsk) and the Institute of Informatics Systems (Novosibirsk), together 
with the Institute of Applied Knowledge Processing Systems (FAW, Ulm), are 
working on a speech control system for an intelligent robot. 

The robot controlled by the system works in a building containing several 
rooms. It executes user commands expressed in English, e.g., Go to room 5 or 
Transfer the computer from the first room to the second room. In addition, it can 
answer some questions, e.g., Where are you? or What is located in the room? 

This paper presents the architecture and scheme of operation of a system 
for speech control of an intelligent robot. We give a detailed description of the 
world in which the robot is working, the robot’s abilities and the control lan- 
guage. Implementation characteristics of the main components of the system are 
presented; the paper contains numerous examples. 

1 Architecture and Operation of the System 

The problem of robot control with spoken natural-language commands is divided 
into the following subtasks: 

— speech recognition; 

— translation of command text into a formal representation; 

— execution of the command and the corresponding modification of the robot’s 
world; 

— visualization of command execution results; 

— generation of the robot’s response and its transformation into a voice mes- 
sage. 

It is important also to ensure a closed loop in the operation of the speech 
control system: reception of a command, its analysis, execution, and return to 
the reception of the next command. 

To perform these functions, the system includes the following components: 

— a speech entry subsystem containing a microphone, a sound card, and a 
software speech recognizer; 

— a linguistic processor that receives the text of command in a natural lan- 
guage (English) from the speech recognizer and translates it into a formal 
representation; 

— a command execution subsystem (the robot simulator); 

— a speech synthesizer that voices the robot’s messages. 
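
The closed loop formed by these components can be sketched as a simple pipeline. The sketch below is illustrative only: text strings stand in for speech input and output, and the function names, the tuple form of formal commands and the tiny command grammar are assumptions rather than the actual interfaces of the system.

```python
def recognize(utterance):
    """Stand-in for the speech recognizer: returns the command text."""
    return utterance.strip()


def translate(text):
    """Stand-in for the linguistic processor: text -> formal command."""
    words = text.lower().rstrip("?.").split()
    if len(words) == 4 and words[:3] == ["go", "to", "room"] and words[3].isdigit():
        return ("GoTo", int(words[3]))
    if words == ["where", "are", "you"]:
        return ("WhereIs", "robot")
    return ("Unknown", text)


class Simulator:
    """Stand-in for the command execution subsystem (robot simulator)."""

    def __init__(self, room=1):
        self.room = room

    def execute(self, command):
        op, arg = command
        if op == "GoTo":
            self.room = arg                    # modifies the robot's world
            return None                        # no explicit answer required
        if op == "WhereIs":
            return f"I am in room {self.room}"
        return "I do not understand"


def control_loop(utterances):
    """Receive, analyse and execute each command, then take the next one."""
    robot = Simulator()
    spoken = []
    for utterance in utterances:
        reply = robot.execute(translate(recognize(utterance)))
        if reply is not None:
            spoken.append(reply)               # stand-in for the speech synthesizer
    return spoken


replies = control_loop(["Go to room 5", "Where are you?"])
```

Each stage only depends on the output of the previous one, so any of the stand-ins could be replaced by a real component without changing the loop.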
The speech entry subsystem is based on a commercially available package, 
ViaVoice by IBM, which produced quite satisfactory results after some necessary 
adjustment. Speech synthesis uses a standard software component, Microsoft 
Concatenated Text-to-speech Engine. 

The linguistic processor is constructed with the Lingua-F [8] instrumental 
system that uses a semantics-oriented approach to the analysis of NL texts that 
was proposed by A. S. Narin’yani [9]. 

The subsystem of command execution and the environment emulating the 
robot’s world were implemented with the help of SemP-TAO [10]. SemP-TAO 
is an integrated software environment for knowledge representation and pro- 
cessing that was developed for the construction of intelligent systems requiring 
description of subject domains with complex structure and semantics, as well as 
a combination of logical inference and calculations over imprecise values. 

The functional overview of the system in Fig. 1 demonstrates the complete 
cycle of execution of a command given to the robot. 






Fig. 1. The functional overview of the system 



A command pronounced by the operator is transmitted to the voice recognition 
system, which transforms the phonetic representation into a textual sentence. 
Then this text is processed by the linguistic processor, which translates the sentence 
into a sequence of formal commands. For interpretation, the formal commands 
are passed to the simulator, a subsystem which simulates the robot’s 
behavior. According to the commands, the simulator performs all prescribed actions, 
which generally results in a transformation of the simulated environment, 
the world of the robot. Such transformations are visualized on the computer screen 
by a special program that enables the operator to check the robot’s actions and 
the states of its environment. If a command requires an explicit answer, then the 
command interpretation subsystem generates an appropriate text; this text is 
then transformed into speech and pronounced. 

When processing of a command is completed, the operator can input the 
next command. 



2 Intelligent Robot and Its Control Language 

In this section we consider the robot’s world, the robot’s features and abilities 
as well as the robot control language. 



2.1 Robot’s World and Robot’s Abilities 

The robot’s world consists of several rooms, which may contain objects.
Objects are divided into classes such as furniture and equipment.

Relations are used to define the position of objects with respect to each
other. Examples of such relations are: to the left, to the right, below, above,
inside, at the center, etc.

In this model of the world the robot is both an object and a subject. As an 
object, it has the properties of equipment. As a subject, it should be able to move 
furniture and equipment from one room to another and to answer questions of 
the following types: Where are you? What do you see there? Is there a table?

Therefore, the main functions performed by the robot are the following: find,
take, put, move, go, etc.

These functions correspond to a set of operators. The set of operators is 
divided into two levels. The first level is constituted by the operators accessible 
to the user. These are used to state the instructions for the robot. The second 
level is constituted by the operators that are used to implement operators of the 
first level. 

The first-level operators are Bring, Take away, Move, Go To, Where Is, What
Is In. The second-level operators are Find, Take, Free, Put, Say.

There is a separate operator scheme for each operator; it determines the 
conditions, the order (plan) and the results of execution of the operator. In 
contrast to the systems STRIPS and ABSTRIPS [11,12] that use linear operator 
schemes, in our system recursive operator schemes are utilized. 

For example, the following scheme corresponds to the second-level operator
Find:

Find ($this) {
    if locations of $this and the robot are identical, then
        save current location of the robot in variable $location;
        return $location as a result;
    else
        mark current room as already examined;
        if the next room has not been examined yet, then
            save location of the next room in variable $new_location;
            Go to ($new_location);
            Find ($this);
        else
            Say ($this, "not found");
}



Now we describe the scheme for the first-level operator Bring: 

Bring ($this) {
    save current location of the robot in variable $here;
    $location := Find ($this);
    Move ($this, $location, $here);
}
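
As an illustration only, the recursive character of the Find and Bring schemes
can be sketched in Python. The World class, the fixed room order, and the way
Go to, Say and Move are reduced to simple state updates are all assumptions
made for this sketch; they are not part of SemP-TAO. The failure check in
bring is also an addition, since the scheme above does not spell it out.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Toy world (hypothetical): rooms are explored in the listed order."""
    rooms: list                      # room numbers
    location: dict                   # object name -> room number
    robot_room: int = 1
    examined: set = field(default_factory=set)
    log: list = field(default_factory=list)


def next_unexamined_room(world):
    """Return the first room that has not been examined yet, or None."""
    for room in world.rooms:
        if room not in world.examined:
            return room
    return None


def find(world, thing):
    """Recursive Find: search room by room; return the room or None."""
    if world.location.get(thing) == world.robot_room:
        return world.robot_room
    world.examined.add(world.robot_room)          # mark current room as examined
    new_room = next_unexamined_room(world)
    if new_room is None:
        world.log.append(f"{thing} not found")    # stands in for Say
        return None
    world.robot_room = new_room                   # stands in for Go to
    return find(world, thing)


def bring(world, thing):
    """Bring: remember the current room, find the thing, move it here."""
    here = world.robot_room
    world.examined.clear()
    room = find(world, thing)
    if room is not None:
        world.location[thing] = here              # stands in for Move
        world.robot_room = here
    return room
```

With rooms 1 to 3 and a chair in room 3, bring(world, "chair") visits rooms 2
and 3 recursively and then records the chair in room 1, where the robot
started.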



2.2 Formal Language for Robot Control 

A formal language for robot control was developed on the basis of the operator
schemes described above. It is called FOROL (the FOrmal RObot Language).

This language includes the operators WhereIs, GO, and MOVE. Arguments
of these operators may be objects and rooms.

The description of an object has the following form:

OBJECT(name: name_of_object, color: color_of_object),

here name_of_object is the name of the object, and color_of_object is the color
(which may be omitted).

The description of a room in FOROL has the following form:

ROOM(number: number_of_room),

here number_of_room is an integer that denotes the number of the room.

We now give a description of the syntax and semantics of the operators of 
the language. 

The operator WhereIs has the following form:

WhereIs(what: object, where: room).

Here object is the description of an object whose location must be determined
or confirmed, and room describes a room.

Note that one or both arguments of the operator WhereIs can be omitted.
The semantics of the operator depends on which arguments are given
and which are omitted. Consider each case separately.

If only the first argument is given, then execution of the operator results in
searching for an object with the given characteristics and issuing a written or
spoken message on its location. If several objects satisfy the description,
then the information on the first object is output. In case of failure, the
corresponding message is issued.

If only the second argument is given, then the characteristics of all objects
found in room are output.

If both arguments are omitted, then the characteristics of all objects found
in the current room are output.
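
The dependence of WhereIs on its arguments can be sketched as a small Python
dispatch function. This is a minimal illustration, not the SemP-TAO
implementation: the dictionary representation of objects, the message texts,
and the yes/no answer for the case where both arguments are given (which the
description above mentions only as "confirmed") are assumptions.

```python
def describe(obj):
    """Render 'red box' from {'name': 'box', 'color': 'red'}; color is optional."""
    return " ".join(part for part in (obj.get("color"), obj.get("name")) if part)


def matches(obj, pattern):
    """True if obj has every characteristic listed in pattern."""
    return all(obj.get(key) == value for key, value in pattern.items())


def where_is(objects, robot_room, what=None, where=None):
    """objects: list of dicts such as {'name': 'box', 'color': 'red', 'room': 5}."""
    if what is not None and where is None:
        # Only the first argument: report the first matching object.
        for obj in objects:
            if matches(obj, what):
                return f"{describe(obj)} is in room {obj['room']}"
        return f"{describe(what)} is not found"
    if what is None:
        # Only the second argument, or none: list the contents of a room.
        room = robot_room if where is None else where
        return [describe(obj) for obj in objects if obj["room"] == room]
    # Both arguments (assumed behavior): confirm the location.
    found = any(matches(obj, what) and obj["room"] == where for obj in objects)
    return "yes" if found else "no"
```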

The operator GO has the following form:

GO(to: room),

here room describes the room to which the robot should go.

According to this operator, the robot must go to the specified room. If this
is impossible, the corresponding message is output.

The operator MOVE has the following form:

MOVE(what: thing, from: room1, to: room2).

Here thing is the description of the object that should be moved from room1 to
room2. The values of the characteristics of the object are the same as in
the operator WhereIs, with the exception that Robot should not be used as the
name of the object.

Note that only the first argument of the operator MOVE is always required;
all the other arguments are optional. The semantics of the operator MOVE,
like that of WhereIs, depends on which arguments are given.

If all three arguments are given, then the object thing must be moved
from room1 to room2.

If only the two arguments what and from are given, then the object thing should
be moved from room1 to the room where the robot is.

If the two arguments what and to are given, then thing should be found and
moved to room2.

Finally, if only the argument what is given, then the object thing should be
found and moved to the room where the robot is situated.

In all versions of MOVE, a suitable message is issued in the case of failure,
for example: Green chair is not found, Room 200 does not exist, Computer is
already in room 5, etc.
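
The four variants of MOVE differ only in how the omitted arguments are filled
in. The following Python sketch shows one way to express this; the function
name resolve_move, the locate callback standing in for Find, and the use of
exceptions for failure messages are assumptions made for illustration.

```python
def resolve_move(thing, robot_room, locate, from_room=None, to_room=None):
    """Fill in the omitted MOVE arguments; return (source, destination).

    locate is a callable standing in for Find: it returns the room that
    holds thing, or None when the search fails.
    """
    # An omitted 'from' means the object must be found first.
    source = from_room if from_room is not None else locate(thing)
    if source is None:
        raise LookupError(f"{thing} is not found")
    # An omitted 'to' means the room where the robot currently is.
    destination = to_room if to_room is not None else robot_room
    if source == destination:
        raise ValueError(f"{thing} is already in room {destination}")
    return source, destination
```

For example, with the robot in room 1 and a green chair in room 3, giving only
the what argument resolves to a move from room 3 to room 1.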

Note that although the FOROL language includes only a small set of operators,
these operators are expressive enough to describe all the tasks that the robot
should perform.

3 Linguistic Processor 

The linguistic processor (LP) was constructed with the help of a current
modification of the Lingua-F software environment, which was developed in the
1980s [8]. Lingua-F supports the construction of an LP that translates the text
of an NL communication into a formal representation in the FOROL language.
Lingua-F supports all stages of LP construction:

— forming a vocabulary;

— writing production rules for lexical and base analysis of the input text and
rules for generation of the output representation;

— compilation of the rules and the vocabulary;

— debugging and testing of the linguistic processor on a comprehensive data
bank of various NL messages to the robot.

Lingua-F has a facility for saving a stand-alone LP that can exist on its own
and can be used in other software packages.

The LP thus constructed is included in the speech control system of the
robot as a component. A natural-language text is fed to the input of the
LP, and the corresponding formal representation is generated at the output. The
transformation of the text into a formal form is based on a semantics-oriented
approach that enables one to analyse the input text based on the semantics and
pragmatics of the subject domain in which the communication with the robot
occurs.

The linguistic processor consists of two components: the vocabulary, containing
the lexicon of NL requests to the robot, and the production component.
Consider the two components in more detail.

3.1 The Vocabulary and Types of NL Requests 

In the current version of the system, an operator uses two types of NL requests
to the robot: a directive (command) and an inquiry (question). In the
semantics-oriented approach, the words included in the requests are subdivided
into significant words, which are reflected in the formal representation, and
insignificant ones, which are ignored during analysis.

We distinguish the following types of significant words used when addressing
the robot:

— verbs which define the moving of objects, e.g., bring, transfer, move;

— verbs which initiate movement of the robot, e.g., go;

— verbs, interrogative words and collocations which define the search for an
object, e.g., where, find, search, what room;

— objects, e.g., chair, box, table, computer;

— numerals which can be used in requests, e.g., one, first;

— adjectives which define colours, e.g., red, white, brown, green;

— locations, e.g., room;

— prepositions, e.g., from, to.

In addition, requests can include insignificant words, such as number, a, an,
the, situated.

Using the above-mentioned words, one can compose directives (Go to ..., Move
something from ... to ...) and inquiries (Where or What room is located ...,
etc.). The order and number of the components of a request, as well as the word
order within each component, are generally not fixed. The word order is defined
by the grammar of the particular natural language. The rules of analysis and
synthesis are constructed so as to minimise the operator’s feeling of language
restrictions.

We give below several examples of NL requests with the corresponding formal
representations. These examples demonstrate some degree of linguistic
flexibility, in particular in how a room number is specified. Written either in
digits or in words, a room number can be given by either a cardinal or an
ordinal numeral and, accordingly, placed after or before the word room.

First, we consider the directives, which are divided into the types MOVE and GO:

a) In a directive of the type MOVE, Transfer the blue armchair from the first
room to room number 4!, a transposition of the locative components is admitted:
...to room number 4 from the first room. In addition, a similar command will be
analysed correctly when formulated with an ellipsis: Transfer the blue armchair
from the first room to 4! In all cases the directive will be translated into:

MOVE(what: THING(name: armchair, color: blue),
     from: ROOM(number: 1), to: ROOM(number: 4));

b) A directive like Go to the second room! has a completely transparent
translation:

GO(to: ROOM(number: 2)).

The system distinguishes questions that meet the user’s informational needs
concerning both the location of various objects and the presence of objects in
a specified place:

a) The question Where is the robot? can also be formulated as addressed to a
partner in communication: Where are you? Its formal representation is:

WhereIs(what: OBJECT(name: robot));

b) In addition to a question about the robot, it is possible to ask about any
object: Where is the red box? or What place is the red box located in? The
directive Find / Search the red box! is interpreted as an indirect question on
the location of the object:

WhereIs(what: OBJECT(name: box, color: red));

c) Questions on the presence of any objects in the room where the robot is,
What is (located) here / there / in this room?, are translated into:

WhereIs(what: ?, where: ?);

d) Questions intended to detect any objects in a specified place, Is something
in room 5?, What is located / situated in room 5?, are formally represented as:

WhereIs(what: ?, where: ROOM(number: 5));

e) The alternative question Is computer in the room number 2? corresponds to:

WhereIs(what: OBJECT(name: computer), where: ROOM(number: 2)).



3.2 Production Component 

The production component of the linguistic processor translates the incoming 
NL phrase in several steps: lexical analysis, base analysis, and generation. 

The rules of lexical analysis divide the input string into lexical tokens which,
after dictionary lookup, are replaced by the corresponding dictionary entries.
Multiple components that are elements of a composite entry are combined
into a single component. Such a composite entry often serves to resolve
ambiguities. Defining the usage context of a word, i.e., creating a composite
entry, makes it possible to link several meanings to a single word.

For instance, consider the word room in several contexts: a) room 1 (or the
first room), b) in what room. In the first case, the word room is a locating
component, while in the second it denotes a question of type WhereIs. Creation
of the composite entry what room, synonymous with the word where, ensures
correct parsing.
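
The effect of a composite entry can be illustrated with a few lines of Python.
This is a sketch under stated assumptions, not Lingua-F rule syntax: the
dictionaries COMPOSITE and SINGLE, the category names, and the decision to
skip every word that has no dictionary entry are hypothetical.

```python
import re

# Hypothetical mini-dictionary. A composite entry is tried before single words.
COMPOSITE = {("what", "room"): ("QUESTION", "where")}
SINGLE = {
    "where": ("QUESTION", "where"),
    "room": ("LOCATION", "room"),
    "go": ("GO", "go"),
    "to": ("PREP", "to"),
    "from": ("PREP", "from"),
    "box": ("OBJECT", "box"),
    "red": ("COLOR", "red"),
    "2": ("NUMBER", 2),
    "second": ("NUMBER", 2),
}


def lex(text):
    """Split text into words and replace them by dictionary entries."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    entries, i = [], 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in COMPOSITE:            # 'what room' becomes one component
            entries.append(COMPOSITE[pair])
            i += 2
        elif words[i] in SINGLE:
            entries.append(SINGLE[words[i]])
            i += 1
        else:                            # insignificant or unknown word: skipped
            i += 1
    return entries
```

With these entries, In what room is the red box? and Where is the red box?
yield the same sequence, while in Go to room 2! the word room remains a
locating component.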

In the base analysis stage, the parsing tree reflecting the predicate-actant 
structure of the phrase is constructed. First, we construct the second actant, 
which is the object group consisting of a noun (the object) and an adjective, e.g., 
green armchair. Next, we construct actants of two types, from and to, which are 
the locative components represented by nouns with prepositions, e.g., from room 
number two. Finally, the predicate is concatenated with the second actant (e.g., 
bring is concatenated with green armchair) and all locative groups, if any. If the 
parsing succeeds, the whole phrase is reduced to a single component. 

The generation rules transform the tree representation of the phrase resulting 
from the base analysis into the output representation in FOROL. 
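
The last two steps can be sketched as follows. The sketch assumes that lexical
analysis has already produced a list of (category, value) pairs; the category
names, the dictionary used as a parsing tree, and the rule that a preposition
binds the next numeral are simplifications introduced here, not the actual
Lingua-F production rules.

```python
def base_analysis(entries):
    """Group (category, value) entries into a predicate and its actants."""
    tree = {"predicate": None, "object": {}, "from": None, "to": None}
    pending_prep = None                      # preposition waiting for a room number
    for category, value in entries:
        if category == "PREDICATE":
            tree["predicate"] = value
        elif category == "COLOR":
            tree["object"]["color"] = value
        elif category == "OBJECT":
            tree["object"]["name"] = value
        elif category == "PREP":
            pending_prep = value
        elif category == "NUMBER" and pending_prep is not None:
            tree[pending_prep] = value       # locative actant: from / to
            pending_prep = None
    return tree


def generate(tree):
    """Render the tree as a FOROL-style string, omitting absent arguments."""
    args = []
    if tree["object"]:
        fields = ", ".join(f"{key}: {tree['object'][key]}"
                           for key in ("name", "color") if key in tree["object"])
        args.append(f"what: THING({fields})")
    for role in ("from", "to"):
        if tree[role] is not None:
            args.append(f"{role}: ROOM(number: {tree[role]})")
    return f"{tree['predicate']}({', '.join(args)})"
```

Because the locative actants are stored by role rather than by position, the
transposed request (...to room number 4 from the first room) produces the same
output as the original word order.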

4 The Lingua-Voice System: Towards a Cooperative
Processor for Spoken Language Understanding

In this section, we briefly summarize the presented results and outline our next
project, which is related to the field of voice recognition.

The speech control system described in the paper has been fully implemented 
and tested, demonstrating stable operation in a large number of tests. 

The integrated object-oriented environment SemP-TAO enabled us to repre- 
sent the robot’s world in a natural manner, specify and implement an extensible 
formal language for robot control, support visualization of the states of the world, 
and provide a convenient user interface. 

It should be noted that the system is not just a prototype version of the 
speech control system that will be connected to the real device. The integrated 
model of the robot is a good base for experiments and extensions directed at the 
study of a broad range of knowledge representation and processing problems. 
The FOROL language, for example, served as a base for a more powerful robot
control language that includes additional tools for working with spatial
relationships and advanced facilities for the description of rooms and objects.
Implementation of this language will make it possible to work on the
development of a robot control system using both formal communication means and
a richer natural language.

The system described in this paper can also be used as a solid testing ground
for research on the use of spoken language for communication with a wide
spectrum of applications. In this respect, it has given rise to a new project
called Lingua-Voice, which we outline below.

The idea of the Lingua-Voice project is to fill, technologically, the gap
between the output of a standard voice recognition system and the input of an
application.

Today, industrial speech processors produce rather raw output which, in the
best case, can include simple post-processing based on a user-defined
context-free grammar. In fact, a voice recognition system by itself covers only
a small part of the work needed to provide really comprehensive communication
with applications. In particular, the voice processing systems available on the
software market today select “the most probable” word (from about a dozen
phonetic hypotheses), taking into account some universal phonetic and
statistical data rather than information related to the world of the
application or to linguistics.

We are certain that the approach described in [8] for automatic text
processing, based on orientation to a restricted subject domain and
simultaneous processing of many variants, is especially suitable for spoken
language (SL) understanding.

Our new project, called Lingua-Voice, is based on the following principles:

— Multi-variant processing,

— Automated specification and adjustment of an SL-processor to the application,

— Specialized agents for SL-processing,

— Closer integration of voice and linguistic processing.

This development leads us to the construction of a software architecture and
environment, which are briefly characterized below. Their general structure is
presented in Fig. 2 (where “Voice recognizer” and “Linguistic processor”
functionally correspond to the similar components shown in Fig. 1).

The Lingua-Voice system is based on a version of the Lingua-F support
environment which has recently been implemented by M. Zhigalov and D. Shishkin.

The Lingua-Voice system starts by launching a certain voice recognition
engine: ViaVoice, Dragon, or another. After the engine has completed its work,
the whole set of phonetic variants is “extracted” from its inner memory and
transmitted to further processing modules. These data have the following
structure:

$W = \langle w_1, w_2, \ldots, w_n \rangle$, $w_i = \{h_{i,1}; h_{i,2}; \ldots; h_{i,k_i}\}$,

where $w_i$ is a cluster of words detected for the i-th potential word, $n$ is
the number of potential words, and $k_i$ is the number of hypotheses in the
i-th cluster. (This picture is obviously simplified for continuous speech.) We
call the clusters subdefinite words and the W-structures subdefinite phrases to
note the relationship between the issues considered here and the classical
works of A. Narin’yani on the processing of incomplete information.

In the general case, the purpose of a concrete Lingua-Voice processor is to
organize the application of specialized processing agents to such a vector W. If
no correct variants are found, the process is considered to have failed; if a
unique variant is found, it is passed on to the application; if the result is
ambiguous, then a kind of dialog is initiated in order to refine it.
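
The three possible outcomes can be sketched in Python. The sketch assumes that
a subdefinite phrase is represented as a list of sets of word hypotheses and
that the agents have already removed the hypotheses they reject; the function
names variants and dispatch are hypothetical.

```python
from itertools import product


def variants(phrase):
    """Enumerate every word sequence encoded by a subdefinite phrase."""
    if not phrase:
        return []
    # Sorting each cluster makes the enumeration order deterministic.
    return [" ".join(words) for words in product(*(sorted(c) for c in phrase))]


def dispatch(phrase):
    """Classify the processing result as 'failed', 'unique' or 'ambiguous'."""
    found = variants(phrase)
    if not found:
        return "failed", found       # an empty cluster leaves no variant at all
    if len(found) == 1:
        return "unique", found       # passed on to the application
    return "ambiguous", found        # a dialog with the user would start here
```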

To exemplify these agents, we mention here 

— statistical corrector, 

— syntactical filter, 

— semantic filter and 

— multi-variant analyzer.

[Figure labels: Spoken commands to application; Formal representation of
commands; Dialogue with a user]

Fig. 2. General structure of the Lingua-Voice



Thus, the statistical corrector uses lexical and statistical knowledge entered
through the special user-friendly environment of the Lingua-Voice system during
the adjustment of a concrete processor. These data enable us to efficiently
rearrange and refine the overly universal “probability estimations” produced by
a voice recognition engine; in some cases, the corrector is able to extend the
set of hypotheses with additional words owing to a priori defined contextual
statistical associations.

The work of the above-mentioned components can be illustrated by an
artificially simplified example of processing a phrase:

Take this box and put it on the sixteenth table

The result of applying the corrector can be illustrated by a table (abridged):



Input words   ViaVoice output   Lingua-Voice corrector output
take          date, eight       eight - 22%, eighty - 13%, take - 9%, ...
this          this, if
box           box, boxes
and           then, ten, am     can - 10%, an - 10%, am - 10%, and - 10%, ...
put           put, but
it
on            one, what         one - 15%, what - 15%, would - 12%, on - 12%, ...
the
sixteenth                       thirteen, sixteen, thirty, sixty
table

Note that the ViaVoice engine has not detected the words take, and, and on:
they have been reconstructed by the Lingua-Voice corrector.

In our example the syntactical filter transforms the configuration
{this} + {box, boxes} into {this} + {box}.

Since the considered domain restricts the number of tables to twenty, the
semantic filter refines the subdefinite word {thirteen, sixteen, thirty, sixty}
to {thirteen, sixteen}.
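
Both filtering steps of this example can be reproduced with a short Python
sketch. The word lists, the limit of twenty tables encoded as MAX_TABLES, and
the rule that a singular determiner excludes plural nouns are assumptions made
for the illustration; the actual Lingua-Voice agents are not specified at this
level of detail.

```python
# Hypothetical knowledge used by the two filters.
NUMERALS = {"thirteen": 13, "sixteen": 16, "thirty": 30, "sixty": 60}
MAX_TABLES = 20
SINGULAR_DETERMINERS = {"this", "that", "a", "an"}
PLURALS = {"boxes", "tables", "chairs"}


def syntactic_filter(phrase):
    """Drop plural hypotheses from a cluster that follows a singular determiner."""
    result = [set(cluster) for cluster in phrase]
    for prev, cur in zip(result, result[1:]):
        if prev and prev <= SINGULAR_DETERMINERS:
            cur -= PLURALS               # in-place update of the set in result
    return result


def semantic_filter(cluster, limit=MAX_TABLES):
    """Keep only numerals that can denote an existing table."""
    return {word for word in cluster if NUMERALS.get(word, 0) <= limit}
```

Applied to the clusters above, syntactic_filter reduces {box, boxes} to {box},
and semantic_filter leaves {thirteen, sixteen}; the remaining ambiguity would
be resolved by the multi-variant analyzer or by a dialog with the user.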

The first results of constructing the Lingua-Voice system are rather
encouraging. Along with “The robot”, we have in mind an application related to
Internet communication. The development of Lingua-Voice for the Russian
language is also very important and challenging.

Abstract. Many different sign languages are in use for communication,
especially among hearing impaired people. Translation from one sign
language to another is a difficult problem that needs an efficient solution.
Processing of signs is different from the processing of words in natural
languages. Sign languages use shapes and movements to express meaning.
The objective of our research project is to develop a multi-lingual
machine translation system for sign languages. As a first step towards
achieving this objective, we analyzed three sign languages. This paper
outlines the current research results.



1 Introduction 

Many different sign languages are in use in different parts of the world.
People who use different sign languages communicate with the help of
a translator. People with no hearing disability who are unfamiliar with sign
language may need sign language interpreters to communicate with hearing
impaired people. With the recent advances in communication and transportation
technologies, there is an increasing demand for such interpreters and
translators for the disabled. The problem of translating sign languages could
be eliminated by using a universally accepted standard sign language.
Developing a machine translation system for sign languages is another solution.
This paper presents the results of the latter approach.



1.1 Sign Languages 

Sign Language (SL) is one of the methods used by hearing impaired people to
communicate with others. There is no single, universal SL. The formation of an
SL is influenced by the environment, customs, regions of a country and the
natural language used in that country. Thus, signs can differ from one SL to
another. Signs express meaning through shapes and movements. This way of
communication is different from the words and sentences used in natural
languages. It is observed that different sign languages share common signs. For
example, the signs for victory and failure are the same in any SL [1]. A sign
can also have a different meaning in a different SL; for example, the promise
sign in Japan is the same as the friend sign in Sri Lanka.

There have been attempts to develop a standard universal sign language for all.
However, these attempts were not successful enough to produce a universally
accepted or truly international SL. In 1971, the international sign form called
Gestuno was developed by the World Federation of the Deaf [2]. Its vocabulary is
based on the European SLs, and some European countries have adopted Gestuno.
It is mainly used at international meetings. However, Gestuno is not widely
accepted in the world for day-to-day use. Translators provide much-needed
help in establishing communication between users of different SLs [3].

Until the 1960s, SL was not considered to be a language, and it was used
only for educating the hearing impaired. In the 1980s, sign linguistics was
born and SL began to be researched from a linguistic point of view [4].

The term sign includes gestures in its general meaning. When used in SL
linguistics, the term sign means a component of SL which is equivalent to
a word in a natural language.

Three SLs, American Sign Language (ASL), Sri Lankan Sign Language (SSL)
and Japanese Sign Language (JSL), are analyzed in this paper. Section 2 of this
paper discusses the methodology used to compare sign languages. Section 3
outlines the implementation, Section 4 presents the experimental results, and
Section 5 gives the conclusions.

2 Methodology 

2.1 The Basic Idea 

Analysis of SLs is a basic requirement in developing an SL machine translation
system. Discovering the rules for translating a sign from one SL into another
enables the development of the translation system. It is also necessary to
analyze the relationships between SLs. Some of these rules and relationships
between SLs are discussed in this section.

In natural language processing, the basic unit of analysis is a word. Similarly,
the analysis of a sign leads to the analysis of an SL. The structure of a sign
can be defined in terms of morphemes and phonemes. Morphological and
phonological analysis are the subjects of mainstream research [4]. The
following section focuses on phonological analysis.

2.2 Phonological Analysis 

In phonological analysis, the parameters of a sign are defined. Four parameters
are considered to correspond to the phonemes of a sign language [4]. William
C. Stokoe introduced three composition parameters: DEZ (designator), TAB
(tabula) and SIG (signation). DEZ represents hand shapes, TAB represents
locations on the body and SIG represents hand movements. Battison [4] added
the fourth one, ORI (the orientation of the palm). These four parameters are
considered to be the components of a sign. We analyze the signs according to
these four articulatory parameters.

3 Implementation

3.1 Vocabulary

Signs common to all three SLs were selected from existing books [5] [6] [7].
Table 1 shows the total number of signs for the SLs.



Table 1. Number of signs

       in books   selected   comparable
                             verbs   nouns   no.
ASL      1167       650
SSL      1051       441       25      81     10
JSL     15293      3941      (Total: 116)

The selected column in Table 1 shows the number of signs that remain after
eliminating signs with strong religious meaning or unique cultural
characteristics, country names, and signs with complex movement. The vocabulary
of SSL is the smallest of the three, so SSL was chosen as the base SL. Among
the selected signs, only 116 are comparable among the three SLs. They are
divided into three categories: verbs, nouns and numbers.



3.2 Computerizing Signs 

For computers to recognize signs, the signs must be coded by their parameters.
Figure 1 shows the code form, which is divided into two main parts, the right
hand and the left hand. The right-hand part begins with r and the left-hand
part begins with l. The same components are applied to both hands.

Fig. 1. Code Form

By default, the right hand is considered the preferred hand and the fingers
are open. The complete code is given in the lab report [8].

1. DEZ (Hand shape)

There are 20 hand shapes and 8 shape aspects. In Figure 1, columns 1 to
3 represent the code for the DEZ parameter. The first half, the pair of columns
1 and 2, represents a hand shape, and the second half, column 3, represents a
shape aspect. Column 1 shows the number of standing fingers, and column 2
represents the code identity. For example, code 23 means that two fingers are
standing (2) and that the sign belongs to the third hand shape (3) (the V
shape: the index and the middle fingers are standing).

2. ORI (Orientation of the palm in relationship to the body)

Columns 4 and 5 represent the code for the ORI parameter. There are 6
orientations of the palm in relation to the body: up, down, front, back,
inside and outside. Column 5 shows the direction of the fingertips.

3. TAB (Initial location)

Column 6 shows the code for the TAB parameter. The signing space in relation
to the body is divided horizontally into six positions, from above the head to
below the waist.

4. SIG (Movement)

Columns 7 and 8 are for the SIG parameter. In column 7, seven hand movements
are identified: one static and six dynamic movements, among which the oblique
movements are also included. Column 8 represents 11 movement aspects. These
were selected carefully according to Stokoe’s classification [4].

For example, according to the above process, the ASL sign for read is
coded as r233rr42w l512ur50n.
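
The column layout described above can be made concrete with a small Python
parser. It is a sketch that assumes one hand prefix (r or l, the left-hand
prefix being read as the letter l) followed by exactly eight one-character
columns; the field names of HandCode are hypothetical, and the complete code
tables are given only in the lab report [8].

```python
from typing import NamedTuple


class HandCode(NamedTuple):
    hand: str          # 'r' (right) or 'l' (left)
    dez_shape: str     # columns 1-2: hand shape
    dez_aspect: str    # column 3: shape aspect
    ori_palm: str      # column 4: palm orientation
    ori_fingers: str   # column 5: direction of the fingertips
    tab: str           # column 6: initial location
    sig_movement: str  # column 7: hand movement
    sig_aspect: str    # column 8: movement aspect


def parse_hand(code):
    """Split one hand code such as 'r233rr42w' into its parameters."""
    if len(code) != 9 or code[0] not in "rl":
        raise ValueError(f"not a hand code: {code!r}")
    return HandCode(code[0], code[1:3], code[3], code[4],
                    code[5], code[6], code[7], code[8])


def parse_sign(text):
    """Parse a sign given as one or two hand codes separated by spaces."""
    return [parse_hand(part) for part in text.split()]
```

For the sign read, parse_sign("r233rr42w l512ur50n") yields a right-hand code
with hand shape 23 (the V shape) and a left-hand code with hand shape 51.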

3.3 Automatic Code Generator 

The automatic code generator was developed on SunOS 4.1.4 for efficient data
input. Figure 2 shows the basic screen of the generator. The screen is divided
into two parts, the information part and the coding part. In the information
part, the user selects the target SL (ASL, SSL or JSL) and types in the meaning
of the sign. The resulting code also appears in this part. In the coding part,
14 lists correspond to the columns of Figure 1; Figure 2 shows only the
right-hand part. A sign is coded by double-clicking an item in each list. This
process produces three code databases, one for each SL, with entries such as

go   : r120ru55w l120ru55w

look : r233rr25n

read : r233rr42w l512ur50n.

3.4 Structured Comparison 

Using the databases described in Section 3.3, the coded signs are compared, for
the parameters DEZ, ORI, TAB and SIG, with respect to commonality, similarity
and difference between the SLs. In commonality, all digits or characters of a
code are exactly the same. In similarity, only the first half of the code is
the same. In difference, the codes are completely different. Table 2 shows
examples for the DEZ parameter.

Of the 116 signs compared, only 50 involve both hands.

[Screenshot: the information part (sign input, SL selection, resulting code)
above the coding part (selection lists for hand shape, shape aspect, palm
orientation, finger orientation, initial position, movement and movement
aspect).]

Fig. 2. Automatic code generator



Table 2. Examples

       Commonality   Similarity   Difference
       (adopt)       (bread)      (wear)
ASL    510           512          232
SSL    510           510          510
JSL    510           515          110
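
The three cases of Table 2 can be reproduced with a short Python sketch. It
assumes that the first half of a DEZ code is its first two characters (the
hand shape) and that a sign whose codes agree in the first half in all SLs
counts as similar; the function names classify and rates are hypothetical.

```python
def classify(codes, half=2):
    """codes: one DEZ code per sign language, e.g. ['512', '510', '515']."""
    if len(set(codes)) == 1:
        return "commonality"             # every character is the same
    if len({code[:half] for code in codes}) == 1:
        return "similarity"              # only the first half is the same
    return "difference"


def rates(table):
    """Return (commonality %, similarity %) over a dict sign -> list of codes."""
    labels = [classify(codes) for codes in table.values()]
    common = labels.count("commonality")
    similar = common + labels.count("similarity")   # common signs are also similar
    return 100.0 * common / len(labels), 100.0 * similar / len(labels)
```

Applied to the three signs of Table 2, classify returns commonality for adopt,
similarity for bread and difference for wear.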



4 Experimental Results 

4.1 Commonality 

Figure 3 shows the percentage of commonality among the SLs. In the graph for
the right hand, the values range from 10% to 60%. About 10% of the signs are
common to the three SLs in any parameter. The DEZ, ORI and SIG parameters show
a low rate of commonality; TAB is by far the highest, at about 35%. For any
pair of SLs, the rate is about 5% to 20% higher than that for all three SLs.
The graph for the left hand shows a tendency similar to that of the right
hand.



4.2 Similarity 

Figure 4 shows the percentage of similarity between the SLs. Commonality is
counted as a subset of similarity. For the right hand, 12% of the signs are
judged to be similar in all three SLs in all parameters. The DEZ rate is as
high as the TAB rate, except between SSL and JSL; however, the rate of
commonality between these two is high. The DEZ and ORI rates increase by 20%
compared with the rate of commonality, while the TAB and SIG rates increase by
only about 5%. The rate of SIG is the lowest, and it differs only slightly from
the commonality rate.

The left hand is almost the same as the right hand, but the SIG value is
comparatively high.

Fig. 3. Classification rate of commonality (surviving labels: panel THE RIGHT
HAND; horizontal axis: parameter, with values DEZ, ORI, TAB, SIG)



Fig. 4. Classification rate of similarity (panels: THE LEFT HAND, THE RIGHT
HAND)



4.3 Consideration 

For the DEZ parameter, there is 55% similarity between ASL and JSL, so an SL
translation system can make use of the fact that the hand shapes of signs are
often the same in these two SLs.

The commonality and similarity of the ORI parameter are low in every graph,
so no applicable rules or relationships can be found.

All graphs show a high commonality rate for the TAB parameter. In any SL, 70%
of signs are executed in front of the chest, because the hands are placed there
unconsciously and because the high visibility of this position allows receivers
to recognize signs clearly. Since 70% of TAB values are the same from the
outset, a high rate for the TAB parameter is natural. When a sign is translated
into another SL by a translation system, its location relative to the body will
remain unchanged in most cases.

For the SIG parameter, the similarity for the left hand is high, but the other
graphs show low values, and no rules can be found.



4.4 The Verb Category 

The above analysis was applied to all the signs. The analysis now turns to a
comparison by part of speech and the four parameters. The one hundred and
sixteen signs are classified into three categories: verbs, nouns and numbers.
Only verbs are considered here. The verb category has 25 signs, of which 18 use
both hands.
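
To put these counts in proportion, the short Python sketch below works out the shares they imply. It uses only the figures stated above (116 signs, 25 verbs, 18 two-handed verbs); the variable names are illustrative.

```python
# Counts as stated in the text.
TOTAL_SIGNS = 116
VERB_SIGNS = 25
TWO_HANDED_VERBS = 18

# Share of the analyzed signs that are verbs.
verb_share = VERB_SIGNS / TOTAL_SIGNS

# Share of the verb signs that use both hands.
two_handed_share = TWO_HANDED_VERBS / VERB_SIGNS

print(f"verbs: {verb_share:.1%} of all signs")
print(f"two-handed: {two_handed_share:.1%} of verbs")
```

Verbs thus make up roughly a fifth of the analyzed signs, and roughly three quarters of them use both hands.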



Fig. 5. Classification rate of similarity in the verb category (panels: the left hand, the right hand; horizontal axis: parameter)



Figure 5 shows the result of comparing verbs by the four parameters. It
resembles the results for similarity and commonality over all signs, except
that the similarity rate of SIG is higher than in those results. The likely
reason is that verbs express movement in natural language, and their concepts
are almost the same in any language. The sign for a verb depicts a similar
concept, and this is reflected in the similarity of the SIG parameter.



5 Conclusions

Relationships between SLs are found by comparing the SLs by parameter and by
category. 40% of signs share a common location on the body in the three SLs.
60% of hand shapes are similar in ASL and JSL. The orientation of the palm
needs more research before rules can be found. The hand movement also needs
further research, although in the verb category 25% of movements are similar in
the three SLs. The analysis of the verb category shows that more rules can be
found within specific categories. Searching for rules and relationships between
SLs by category is therefore effective, and classifying signs into categories
correctly and comparing those categories is meaningful.

No special relationship or similarity was found between any pair of the three
SLs investigated. This implies that each SL is unique. Each country (the U.S.,
Sri Lanka and Japan) has its own unique culture, so a high rate of similarity
may not exist.

The vocabulary of an SL is said to exceed 20,000 signs, and only about 0.6% of
it (116 signs) is analyzed in this paper. More vocabulary needs to be
investigated to reach better conclusions. Some sign descriptions in books are
difficult to interpret, so practical knowledge of the three SLs is needed.
Extending the analysis to other SLs is also necessary.